Abstract


This thesis embodies the design of a capability based
stack processor best suited for execution of programs in the
newly proposed language Ada. Hence the principal emphasis
in the thesis is on the structured development of a
processor that reduces the semantic gap between the programs
written in Ada and the object code produced by the
processor.

One of the important features of Ada that makes it
different from other widely used programming languages is
its facilities for data abstraction. Moreover, a highly
desirable characteristic of a reliable computing environment
is that it should support efficient execution of a process
in a number of small protection domains implemented
according to the 'principle of least privileges'. Both of
these features are well supported in the proposed
architecture through the definition of hardware-recognised
objects called packets and tagged capabilities.

Analysis of the execution characteristics of any typical
Ada program (on a conventional architecture) reveals that a
considerable amount of execution time is spent in executing
compiler generated code for procedure call-return.
Similarly, a significant proportion of the execution time of
an Ada process, executing in a protection system that is
built around the 'principle of least privileges', is devoted
to frequent domain switching. A new design methodology is
proposed that facilitates the choice of efficient
architectural support (hardware/firmware) for these two
mechanisms. The methodology is based around the definition
of a new complexity measure for exo-architectural
components.

A considerable portion of the thesis is devoted to the
description of the hardware organization of the processor
and the associated capability mechanism. The choice of
instruction forms and other architectural features is
justified through critical analysis of proposals available
in the related literature. In the absence of any usage
statistics for Ada, suitable statistics for other Algol-like
languages are used.

Chapter 1
Introduction

In recent years, many efforts have been made to develop
appropriate language and system support for large scale
software development/maintenance and reliable software. The
complexity of any large system is more manageable when it is
decomposed into relatively stable subsystems. This was
suggested by Simon [Sim69] as a common organizing principle
of all complex systems. In the domain of software
engineering, similar notions of hierarchical complexity
decomposition have been advocated through the introduction
of the concepts of structured programming, program
modularity and information hiding [Dij68, Par72]. Various
programming languages (MODULA, ALPHARD, CLU, etc.) have
been proposed to consolidate these concepts [Hor83]. The
design of the programming language Ada could be considered a
culmination of all these efforts. One of the prime
objectives for the design of Ada was to propose a language
that offers significant advantages in large scale software
development. It provides excellent features for data
abstraction and modularity.

It is well understood that two of the major
requirements for implementation of reliable software are
[Lin76]: (a) execution of processes in small protection
domains and (b) system structure to support implementation
of the concept of abstract data types and information
hiding.



This thesis embodies the design of an architecture that 
has the following objectives: 

a. provide architectural support for efficient
execution of programs written in Ada,

b. provide architectural support for elegant
implementation of the concept of data abstraction
and modularity as in Ada,

c. provide support for implementing the principle of
'least privileges'¹ (small protection domains) and
flexible sharing.

These architectural features facilitate development and
implementation of large scale reliable software. The
objective of the proposed design is represented in Fig 1.1
(the arrows between the blocks are to be read as
"supports"/"facilitates").


1.1 Language Ada and the Architecture 

A review of the research efforts in language-oriented
architecture design reveals that the primary aim of such
efforts is to reduce the semantic gap between the high-level
language (HLL) and the architecture that would execute
programs written in the language. Such architectures are
generally categorized into the classes of high-level
language architectures and language-directed
architectures² [Mye82].


¹ This concept is introduced in section 1.2.
² These terms have no universally accepted definitions.

The characteristic feature of a high-level language
architecture is that the high-level language is either
considered as the assembly language of the machine or
interpreted directly by microcode or hardware. These
architectures practically reduce the semantic gap to zero
but have severe disadvantages and limitations that make them
practically infeasible [Mye82, DiP80].

The language-directed approach to architecture design
implies an exo-architecture³ [Das82b] that is designed with
high-level languages in mind such that this thinking
permeates all of the architectural design decisions [Mye82].
The operations and data structures frequently used by the
programs written in the HLL are usually supported by
semantically equivalent exo-architectural features.
Essentially an attempt is made to distribute the
complexity of the HLL program to machine language mapping
evenly between the compiler and the interpretive mechanism of
the architecture (microprogram or hardware). The design
proposed in this thesis is an architecture directed towards
Ada.

As far as architectural support is concerned, Ada is
another block-structured Algol-like language with additional
facilities for modularization and data abstraction. In most
of the previous Algol-like languages, the procedure was the
primary abstraction mechanism. Data abstraction in a
programming language is a mechanism which encapsulates the
representation and allowed operations of a data type. The
representation of the type remains invisible to the user.
An object of the type is allowed to be manipulated only by
the operations specified by the implementor within the
encapsulation. Ada defines an abstract data type
through the use of Package modules and Private type
declarations [GoH81].

³ The architecture as viewed by the compiler writer.

Numerous proposals have been made for language-directed
or HLL architectures for block-structured languages [Mye82].
Among the commercial architectures, a series of machines
from Burroughs [Dor79, Org73], the MU5 and the ICL2900 [BiB80]
series of machines (to name a few) could be considered to be
directed towards Algol-like languages. So the obvious
question arises: what are the new features in the proposed
architecture that make it distinct from the earlier
designs? The following paragraphs indicate
some of the distinctive features of the architecture.

A feature that distinguishes Algol-like languages from
other procedural languages is the mechanism of variable
addressing that implements the notion of scope/visibility of
free variables in these languages. Various techniques for
implementing this mechanism at the compilation level have
been proposed in the literature. One of the earliest
techniques for implementing the variable addressing
mechanism is due to Dijkstra [Dij60] and one of the latest
is due to Tanenbaum [Tan78]. Until recently [Dep82a] no
formal performance analysis of these proposals was reported
in the literature. Some architectures have provided
support for variable addressing in block-structured
languages, but their designers did not provide any formal
argument for the choice of the architectural support. Only
Tanenbaum informally justified the support in his proposal
[Tan78] by relating it to usage statistics of the language
under consideration.

A new methodology is proposed in this thesis that not
only provides a formal basis for performance analysis of
these mechanisms but also directly indicates the most suitable
architectural support for the execution of the
block-structured language under consideration. The two new
complexity measures proposed in this thesis, virtual transfer
complexity and real transfer complexity, provide the
basis for the methodology.

It should be noted that a significant amount of
execution time is spent in management of the
block-structured environment for enforcing the
scope/visibility rules of these languages. The choice of
the variable addressing mechanism directly influences the
complexity of implementation of the Pre call, Pre entry, Post
return and Post exit sequences necessary for the maintenance
of the block-structured environment. The high frequency of
usage of procedure call-return and block entry-exit in
structured programs is well known [Dep82b, Tan78, Mye82].
It follows from the above discussion that the
design of the architectural support for variable addressing
plays a key role in determining the efficiency of a
language-directed architecture for a block-structured
language like Ada. The above mentioned methodology has been
used in the design of the architectural feature for the
variable addressing mechanism in the proposed design.

Ada supports separate compilation of program units to
facilitate large scale software development. The program
units in Ada are subprograms, packages and tasks [GoH81].
In this proposal, the storage is not viewed as a linear
sequence of words; rather it is viewed as a set of objects.
The word 'object' signifies a group of related storage
elements with the same lifetime. A similar view of the
storage has been adopted in many of the recent proposals for
new architectures [Mye82]. The compiled versions of the
above mentioned program units are represented at the
architectural level as hardware/firmware recognized objects
called Packets. A packet can encapsulate the compiled
version of a package or of one or more subprograms. The packet
has some similarities with the module object in the SWARD
architecture [Mye82]. A closer look at the functional
characteristics reveals the superiority and the distinctive
features of the proposed object in the context of Ada.
Architectural support for multitasking has not been included
in this preliminary version of the design.

Some of the new features in Ada that are supported by 
the proposed design are indicated below. 


1. Ada provides facilities for data abstraction through the
use of packages and private type declarations. The
proposed architecture provides a new and elegant
mechanism for implementation of abstract data types.

2. The language allows dynamic arrays and discriminant
records. The actual representation of these composite
structures might not be known until execution time.
The architecture has adequate features for
representation and manipulation of such objects (which
include support for run time subscript bound checking).

3. Ada allows declaration of subtypes and definition of
subranges. Furthermore, subranges could be determined at
run time. The architecture introduces new primitive
data types and instructions that allow efficient
implementation of run time constraint checking.

4. The semantics of the parameter mechanism in Ada and the
necessity of dynamic type checking pose a new problem
for the implementors of the language. The proposed
design incorporates an unconventional but efficient
feature in the architecture that facilitates the
handling of Ada parameters.
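
The run time checks named in items 2 and 3 (subscript bound checking and subrange constraint checking) can be illustrated in software. The following Python sketch is only an illustration of what such checks do, not the proposed hardware mechanism; the names `Subrange` and `DynamicArray` are hypothetical and do not come from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subrange:
    """A subrange whose bounds may only become known at run time."""
    low: int
    high: int

    def check(self, value: int) -> int:
        # Return the value unchanged if it satisfies the constraint.
        if not self.low <= value <= self.high:
            raise ValueError(f"{value} not in {self.low}..{self.high}")
        return value


class DynamicArray:
    """An array whose index bounds are fixed when it is created."""

    def __init__(self, low: int, high: int, fill: int = 0):
        self.index_range = Subrange(low, high)
        self._cells = [fill] * (high - low + 1)

    def store(self, index: int, value: int) -> None:
        self.index_range.check(index)  # subscript bound check
        self._cells[index - self.index_range.low] = value

    def load(self, index: int) -> int:
        self.index_range.check(index)  # subscript bound check
        return self._cells[index - self.index_range.low]
```

In a conventional architecture a compiler would have to emit an equivalent test around each constrained assignment and each subscripted access, which is the overhead that dedicated data types and instructions aim to remove.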


1.2 Capability based addressing 

The concepts of 'capability' and capability based
protection were introduced by Dennis and VanHorn in their
classic paper on "Programming Semantics for Multiprogrammed
Computations" [Dev66]. A capability names an object x
and a set r of access rights (e.g., read data, write data, read
capability, enter etc.) for x [Den82, Mye82, Lin75, SaS75].
The capability is a ticket in the sense that possession of the
capability unconditionally authorizes the holder r-access to
x. Various systems have used capabilities in quite
different ways, but a capability representation would
generally have the following attributes:

a. a capability identifier, representing a system-wide 
unique name for an object (often the identifier has been 
loosely used as a 'capability' in the literature) and 

b. a set of access rights that the capability allows 
to the object that it names. 
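
The two attributes can be pictured as a small record. The Python sketch below is an assumption-laden illustration rather than the representation used in the proposed design: the class names `Rights` and `Capability`, the method `permits`, and the particular set of rights (taken from the examples mentioned in the text) are all hypothetical.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Rights(Flag):
    """Example access rights; the actual set is an assumption."""
    READ_DATA = auto()
    WRITE_DATA = auto()
    READ_CAPABILITY = auto()
    ENTER = auto()


@dataclass(frozen=True)
class Capability:
    identifier: int  # system-wide unique name of the object
    rights: Rights   # access rights this capability grants

    def permits(self, requested: Rights) -> bool:
        # Every requested right must be present in the capability.
        return (self.rights & requested) == requested


cap = Capability(identifier=17, rights=Rights.READ_DATA | Rights.ENTER)
```

The record is frozen so that a holder cannot widen its own rights after the capability has been issued.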

As mentioned earlier, the storage in this architecture
is viewed as a set of objects. These objects are created
via machine instructions and are named at the time of
creation. The names are returned to the creating process.
These names/identifiers represent logical addresses of the
objects. The term capability as used in this thesis
represents an occurrence of these names. Moreover it is
ensured that these names are system-wide unique and are
never reused in the lifetime of the system.

The principle of capability based protection implies
that a process could access an object if and only if it has
the 'capability' for the object with appropriate access
rights. This leads to the notion of capability based
addressing [Fab74]. In a system using capability based
addressing, a table is maintained in the system that
contains the information required to translate a logical
address (capability identifier) to a physical address in the
primary memory. When a system integrates capabilities into
the hardware for memory addressing, the
architecture is generally categorized as a 'capability
architecture'. In a capability architecture a capability is
interpreted on each reference to primary memory.
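
The translation step can be sketched as a table lookup. This Python fragment is a minimal illustration under stated assumptions: the class `ObjectTable`, its methods, and the (base, length) entry layout with a bounds check are hypothetical and are not taken from the thesis.

```python
class ObjectTable:
    """Maps capability identifiers (logical addresses) to physical storage."""

    def __init__(self):
        self._entries = {}  # identifier -> (base address, length in words)
        self._next_id = 0

    def create(self, base: int, length: int) -> int:
        # Identifiers are handed out once and never reused.
        identifier = self._next_id
        self._next_id += 1
        self._entries[identifier] = (base, length)
        return identifier

    def translate(self, identifier: int, offset: int) -> int:
        # Performed on every reference to primary memory.
        base, length = self._entries[identifier]
        if not 0 <= offset < length:
            raise IndexError(f"offset {offset} outside object {identifier}")
        return base + offset
```

For instance, an object created with `create(1000, 16)` would have word 3 translated to physical address 1003 in this sketch.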

A protection domain is an independent local address
space defining the total set of addresses that can be
formulated by a set of instructions [Mye82]. The well-known
'principle of least privileges', propounded as a desirable
characteristic of protection models for secure and reliable
computation, indicates that a program should have access to
only those objects that are necessary for successful
execution of the program [Dev66, SaS75, Lin76, Den82,
Mye82]. This principle dictates that a process should
execute in a number of small protection domains. Usually a
protection domain is associated with a protected procedure
[Dev66, Lin76, GrD72] and a process is executed by calling
these protected procedures. The instruction representing a
call to such a procedure, during the execution of the
process, is referred to as the 'Enter' instruction in the
literature. The execution of an 'Enter' instruction causes
switching of protection domains. In a system using
capability based addressing and protection, a protection
domain is characterized by a set of capabilities.
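
The relationship between a process, its current protection domain and the 'Enter' instruction can be sketched as follows. This is an illustrative Python model only; the classes `Domain` and `Process`, the string-valued rights, and the `procedures` registry that maps a protected procedure to its domain are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """A protection domain: the capabilities a procedure may use."""
    name: str
    # object identifier -> set of access rights held for that object
    capabilities: dict = field(default_factory=dict)

    def permits(self, right: str, obj: int) -> bool:
        return right in self.capabilities.get(obj, set())


class Process:
    def __init__(self, start: Domain, procedures: dict):
        self.current = start
        # protected procedure identifier -> its domain (hypothetical registry)
        self.procedures = procedures

    def access(self, right: str, obj: int) -> None:
        # Access is refused unless the current domain holds the right.
        if not self.current.permits(right, obj):
            raise PermissionError(
                f"{self.current.name}: no '{right}' right for object {obj}")

    def enter(self, procedure: int) -> None:
        # 'Enter' needs the enter right, then switches the domain.
        self.access("enter", procedure)
        self.current = self.procedures[procedure]
```

After `enter`, the capabilities of the calling domain are no longer usable, which is how small domains limit what each procedure can reach.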

The principal motivation in choosing capability based
addressing as the basic addressing mechanism in the proposed
architecture was to provide efficient architectural support
for implementation of packages and abstract data types in
Ada. In addition to providing this support, the use of
capabilities and capability based addressing facilitates
flexible sharing and run time implementation of 'the
principle of least privileges' [GrD72, Fab74, SaS75, Lin76,
Mye82].

In this proposal a protection domain is associated with
a packet. The granularity of protection might be considered
to be coarser than in most of the capability architectures
proposed earlier (e.g., System 250, IAPX 432, Cambridge CAP
computer etc.). The proposal introduces the notion of 'user
controlled granularity of protection' in the sense that a
packet object could represent one or more subprograms.
A finer granularity could easily be achieved if each
subprogram is compiled separately and is thus
represented as a separate packet. This option is provided to
facilitate efficient execution of Ada programs.

Capabilities have obvious similarities with the segment
descriptors used in systems using segmented memory
management. The principal difference is that capabilities
represent system-wide unique names and the handling of
capabilities is not confined to any privileged state of the
system. Thus capabilities may be easily passed between
protection domains, providing flexible but controlled
sharing of data and procedures. The use of capability based
addressing as a uniform method of addressing shared objects
has been ably demonstrated by Fabry [Fab74].

An important requirement for implementing capability 
based addressing and protection is that there should be some 
basic mechanism in the architecture that distinguishes 
capabilities from data. There are essentially two 
approaches to provide this distinction. A brief explanation 
of these two approaches for capability based architecture 
design is given in Chapter 2. 

In this architecture the capabilities are considered as
a primitive data type in the machine and the definition of
the architecture prevents any user program from fabricating
capabilities. Every object in the architecture is addressed
through a capability. The objects in the architecture that 
are protected through the capability mechanism are: the 
process stack segment, packets and objects created in the 
heap. 

One common objection to the use of capability based
architectures is that if every address is resolved through a
capability, the additional indirection implied in the scheme
leads to inefficient execution. It is almost universally
accepted that a system that allows implementation of the
principles of 'least privileges' and 'fail safe default'
(i.e., access based on explicit authorization) provides a
more secure and reliable computing environment [Sas76,
Den80, Den82, Mye82]. A system using a capability based
protection mechanism would allow the implementation
of these policies. But the execution time overhead implied
in frequent domain switching makes these architectures
unattractive. The proposals of architectures that use
tagged capabilities [Mye82, Den80, Jag80, Wil72]
significantly reduce this overhead (explained in Chapter 2).
The use of special registers to facilitate capability
addressing [Eng74, Den80] leads to reduction of the overhead
of indirect addressing. The proposed architecture uses
capability registers and tagged capabilities.


1.3 Organization of the thesis 

The next chapter includes discussion on the two major 
approaches to representation and handling of capabilities in 
the existing capability architectures. It also contains a 
brief discussion on the advantages of the tagged capability 
representation and 'self-identifying' data (in general). 

Chapter 3 deals with the issue of architectural support 
for variable addressing in Ada. Two new complexity measures 
for exo-architectural components are developed and the 
measures are used as the basis of a design methodology for 
designing architectural support for variable addressing. 
The methodology is finally used in choosing the
architectural support for variable addressing in the
proposed design.

Chapter 4 describes the proposed design. All the
distinctive features in the design are explained in separate
subsections. The rationale for choosing a stack oriented
architecture for Ada is also presented in this chapter.

A summary of the semantics of the instructions proposed 
in this architecture is provided in Chapter 5. 

Finally, the last chapter discusses the results of this
research and includes some suggestions for further study. A
few critical remarks on two existing architectures (SWARD
and IAPX 432), in the context of execution of Ada programs,
are also provided.

Chapter 2
Two Approaches to Capability Architecture Design

The integrity of a capability based addressing or
protection mechanism depends totally on the protection of
the capabilities. It is important to distinguish between
the capability information and the data stored in memory.
Essentially there are two approaches to protection of
capabilities. These two methods distinguish the two
approaches to capability architecture design. The two
approaches are:

a. the partitioned memory approach 


b. the tagged memory approach 


2.1 The Partitioned Memory Approach 

The partitioned memory approach is a natural extension
of the virtual memory mechanism. In virtual memory systems
using segment descriptors, the segment descriptors are
stored in segments that are only accessible to the
supervisor state of the machine. In the partitioned memory
approach to capability architecture design, capability and
data information are stored in different types of segments.
The data words and capability information are never allowed
to reside in the same segment. Some of the architectures
designed using this approach are the IAPX 432 [RaL81, Mye82],
the Cambridge CAP [NeW77] and the System 250 [Eng72].

The capabilities are stored in segments commonly
referred to as capability segments, C-lists or access
segments. All the references made by a program are
interpreted indirectly through a current C-list. As
mentioned earlier, a C-list or access segment is associated
with a domain of protection. The capabilities stored in the
C-list determine the objects that could be addressed by the
program referring to the list. A change of the current C-list
is equivalent to a domain switch operation. In the partitioned
memory approach the capabilities serve the purpose of
information identification over and above addressing and
access control.

The capability identifies the information in the object
it names, through the kind of access it allows to the
object. Usually the access rights present in a capability could
be data rights or capability rights. A capability cannot
have both kinds of rights to the same object.


2.2 The Tagged Memory Approach 

In this approach the capabilities are distinguished
from other information through tags provided in every word
of memory. Thus in this representation every word in the
memory is 'self identifying' [Feu73]. The capabilities are
protected at the word level by a specific tag that
distinguishes a capability from other information and allows
only specific instructions to use the capability information.
Therefore mixed segments containing both data words and
capability words are allowed in the architectures designed
with this approach. Some of the architectures of this kind
are the IBM System 38 and the IBM SWARD.
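
A mixed segment of tagged words can be modelled in a few lines. The Python sketch below is an illustration under assumptions, not the tag layout of any of the machines cited: the two tag values, the `Word` and `Segment` classes and the three operations are hypothetical, and a `Word` tagged `CAPABILITY` stands in for the result of an object-creation instruction.

```python
from dataclasses import dataclass

DATA, CAPABILITY = "data", "capability"


@dataclass(frozen=True)
class Word:
    tag: str    # identifies what the word holds
    value: int


class Segment:
    """A mixed segment: data words and capability words side by side."""

    def __init__(self, size: int):
        self.words = [Word(DATA, 0)] * size

    def store_data(self, offset: int, value: int) -> None:
        # Ordinary stores always produce a data-tagged word, so a
        # program cannot forge a capability out of a bit pattern.
        self.words[offset] = Word(DATA, value)

    def store_capability(self, offset: int, word: Word) -> None:
        # Only an existing capability word may be copied.
        if word.tag != CAPABILITY:
            raise TypeError("operand is not a capability")
        self.words[offset] = word

    def load_capability(self, offset: int) -> Word:
        word = self.words[offset]
        if word.tag != CAPABILITY:
            raise TypeError("word is not a capability")
        return word
```

Overwriting a capability word with `store_data` turns it into plain data, so the tag check in `load_capability` rejects it afterwards.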


2.3 Domain Switching 

As mentioned in Chapter 1, efficient domain switching
is of utmost importance in capability architectures. The
operation of domain switching in the two approaches will be
examined in this section. The comparison is important in
the sense that the efficiency of domain switching dictates
the choice of a particular approach to capability
architecture design for the proposed architecture.

Each object in a partitioned memory machine is
represented by at least two segments, one containing the
data and one containing capabilities. In a tagged machine
the data words and the capabilities can be stored in the
same segment. This feature directly influences the number
of segments involved in the representation of a process. In
the tagged approach, fewer segments are required
and thus fewer capabilities are needed to
represent the domain of a process.

In the partitioned memory approach, a domain switch
involves changing two segments (e.g., consider the ENTER
instruction, as proposed originally by Dennis and VanHorn
[Dev66]):

(i) change of segments representing the code, i.e., a
new code segment will be entered by executing the ENTER
instruction; and

(ii) change of the C-lists representing the protection
domains.

The mechanism is explained with figures 2.1(a) and (b).
Figure 2.1(a) depicts the state before execution of the
ENTER instruction and figure 2.1(b) represents the state
after execution of the ENTER instruction. In figure 2.1(a)
the process is considered to be executing in the protected
procedure represented by the segment P1, before executing
the ENTER instruction. The execution of the ENTER
instruction causes switching of the protection domain and the
process executes in the protected procedure represented by
the segment P2. In figure 2.1(a) the program counter (PC)
specifies the capability for the code segment P1 and the
offset in the code segment containing the ENTER instruction.
The ENTER instruction specifies the capability pointing to
the new C-list2 and also specifies the offset in the new
C-list (containing the capability for the new code segment).
A domain pointer DP, shown in the figure, points to the
C-list representing the current protection domain.

In a tagged machine, the ENTER instruction may only
need a change of code segments to enter a new procedure,
since the capabilities required by procedures can be
embedded in the code segment. The protection domain could
be represented by the capabilities present in the code
segment. The domain change operation in tagged capability
architectures is shown in figure 2.2. Figure 2.2(a) represents
the state before execution of the ENTER
instruction and figure 2.2(b) depicts the state after
execution of the ENTER instruction. The program counter
represents an address in the (segment number, offset) form.

Figure 2.2(a)

Figure 2.2(b)

Moreover it should be noted that the addressing
mechanism in a tagged capability architecture is much
simpler than in architectures based on the partitioned
memory approach. To address a word using capability based
addressing, a capability for a segment and the offset of the
word relative to the base of the segment must be specified.
In the partitioned memory approach, these two components of
the address cannot be stored together in the same segment,
as the offset is represented as a data word or as a portion
of an instruction word. For example, an address to a data
word is specified as a triple (i,j,k) in the CAP Computer
[NeW74]. The component i selects one of 16 capability
segments, j selects a capability from the segment and k
specifies the offset. This extra level of indirection
required to specify a capability in one of the capability
segments is not required in tagged architectures. An
address could be directly specified by a (capability,
offset) pair.
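
The difference between the two address forms can be made concrete with a small model. The Python sketch below only illustrates the extra lookup; the dictionary `objects`, the function names and the sample contents are hypothetical, and real hardware would of course also check access rights and bounds.

```python
# Object store: capability identifier -> words of the object (sample data).
objects = {7: [10, 20, 30, 40]}


def resolve_partitioned(capability_segments, i, j, k):
    """CAP-style triple: i picks a capability segment, j a capability
    within it, and k the offset inside the addressed object."""
    c_list = capability_segments[i]  # extra level of indirection
    capability = c_list[j]
    return objects[capability][k]


def resolve_tagged(capability, offset):
    """Tagged style: the capability is held directly with the offset."""
    return objects[capability][offset]


# Two capability segments; segment 0 holds the capability 7 at index 1.
capability_segments = [[3, 7], [7]]
```

Both `resolve_partitioned(capability_segments, 0, 1, 2)` and `resolve_tagged(7, 2)` reach the same word, but the first needs one more table access to find the capability.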

The above discussion indicates that a tagged capability
architecture allows faster domain switching and operand
addressing. These were the two factors that were considered
in adopting the tagged capability approach in the proposed
design.



2.4 Tagged Architecture 

The tagged capability approach to capability 
architecture design implies that each and every word of the 
memory would have a tag field identifying the word. It is 
important to analyse whether the tagged memory approach fits 
in with the other objectives of the proposed architecture 
specifically oriented towards Ada. 

The uses and advantages of tagged architectures have 
been adequately demonstrated in the literature 
[Feu73,Mye82]. Some of the salient features of tagged 
architectures are summarized below: 

a. In a tagged architecture the data representation is
self identifying. The self identifying nature of
data allows use of generic instructions. The
architecture defines generic instructions, thus the
instruction set is simpler.

b. Tagged architectures allow automatic data
conversion. It has been shown by Myers that a large
portion of the execution time is spent in executing
compiler generated code for data conversion in
non-tagged architectures.

c. In tagged architectures the binding of instructions
to data attributes is deferred until execution
time. This provides an execution speed benefit for
languages requiring this feature of late binding.

d. Type checking at execution time is automatically
done in tagged architectures (the necessary
sequences for type checking are embedded in the
microcode/hardware). This feature eliminates the
necessity of compiler generated code for run time
type checking.
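
Points a, b and d can be illustrated together with one generic instruction. The Python sketch below is a software analogy only; the `Tagged` record, the tag names `"int"` and `"float"`, and the conversion rule are assumptions chosen for the example and do not describe the tag set of the proposed design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagged:
    tag: str       # "int" or "float" in this sketch
    value: object


def add(a: Tagged, b: Tagged) -> Tagged:
    """One generic ADD; the operand tags select the actual operation."""
    numeric = {"int", "float"}
    if a.tag not in numeric or b.tag not in numeric:
        # Run time type check, done by the 'instruction' itself.
        raise TypeError(f"cannot add {a.tag} and {b.tag}")
    if a.tag == "int" and b.tag == "int":
        return Tagged("int", a.value + b.value)
    # Mixed or floating operands: convert automatically.
    return Tagged("float", float(a.value) + float(b.value))
```

A non-tagged machine would need separate integer and floating add opcodes, with the compiler inserting the conversion and any type test explicitly.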

There are two approaches to the design of the tag field:
long tags and short tags. The long tag approach includes
type as well as descriptor information in the tag field.
The usefulness of long tag fields has been demonstrated by
Gehringer [Geh79] and Myers [Mye82]. The type identifier
part of the tag specifies the format of the descriptor and
the descriptor part defines the format of the contents of
the 'cell'. A cell can be an arbitrary expanse of memory. The
long tag approach allows representation of arbitrary cell
types and facilitates implementation of the run time
structures for languages having execution time binding
features.

Ada is a strongly typed language [Hor83]. Except 
for the few cases mentioned in Chapter 1, the type 
compatibility of operands can be checked at compile time. 
Thus the use of long tags will not provide any execution time 
benefits for Ada programs. As mentioned earlier, Ada 
requires execution time support for run time 
range-constraint checking and bound checking for array 
subscripts. 

The proposed design uses 4 bit tags. The rationale 
behind the design of the tag field is explained in Chapter 
4. The primary motivation for using the tagged approach is 
hardware protection of primitive data types in the 
architecture (quite similar to the rationale for tagging in 
the 5700, 6700 and 7700 series of machines from Burroughs 
[Dor79]). 


Chapter 3 
Architectural support for Variable Addressing in Ada 

The importance of usage statistics of various high 
level language (HLL) constructs in the design of the 
instruction set of a language directed machine or the 
execution architecture for the HLL is well understood 
[Mye82, F1H83]. The choice of the technique used in 
interpreting the directly-executable language (DEL) [F1H83] 
constructs or the instructions of a language-directed 
architecture strongly influences the performance. It is 
interesting to note that no well established design 
methodology for choosing the interpretive mechanism or for 
determining the hardware/firmware support for an efficient 
implementation of the interpretive mechanism is known. 
The research presented in this chapter might be considered 
as an attempt towards formulation of such a design aid. 

Ada is the HLL under consideration in this thesis. It 
is one of those HLLs where the variable referencing 
environment is determined by the static block structure of 
the programs [Hor83]. Scope rule enforcement for the 
variable access mechanism represents a characteristic 
feature of such languages. The choice of the technique for 
implementing the variable addressing mechanism directly 
affects the complexity of the Pre Call/Pre Entry & Post 
Return/Post Exit (Entry and Exit correspond to Block Entry & 
Exit respectively) sequences. These sequences are not 
directly involved in variable addressing but they are 
essential overhead for maintenance of the mechanism. 

It is well known that procedure Call/Return & Block 
Entry/Exit are two of the important and frequently used 
operations in a block-structured HLL environment. So they 
are definitely candidates for having semantically equivalent 
constructs in the execution architecture for the HLL or the 
instruction set of a language-directed machine. The one to 
one correspondence of such functions in the instruction set 
leads to effective use of the available processor/memory 
bandwidth [Mye82]. This in turn implies that an instruction 
like Call (for example) includes the firmware implementation 
or hardware control for the Pre Call sequence over and above 
the usual microcode or hardwired sequences for transfer of 
control. The execution performance of such instructions 
will depend on the technique used to implement the variable 
addressing mechanism. The designer of a language directed 
architecture (or the interpreting mechanism for a DEL) is 
faced with the problem of correctly choosing the most 
efficient implementation technique. The discussion to 
follow is quite different from DePrycker's analysis 
[Dep82a], in the sense that he has attempted to evaluate two 
different variable addressing methods on existing target 
architectures. The emphasis in this chapter is primarily on 
the development of a methodology for designing the 
hardware/firmware support for such HLL functions for a 
language directed architecture. 

Various techniques have been used by system designers 
to implement the variable addressing mechanism in block 
structured languages. Four competitive techniques that have 
appeared in the literature are considered. The techniques 
to be considered are: 

a. the classical display implementation as suggested by 
Dijkstra [Dij60, Dep82]; 

b. a modified display implementation as suggested by Rohl 
[Roh75], [BiB80]; 

c. local display implementation as in ICL 2900 Pascal 
compiler [Ree80] and also chosen for a virtual 
architecture for Ada [Dom80]; 

d. implementation of Tanenbaum's proposal [Tan78]. 

When the architect decides to support some of the HLL 
functions through an exo-architectural support [Das82a], the 
set of such supports for a particular function(s) will be 
referred to as an exo-architectural component or simply an 
architectural component. For example, the set of the 
architectural supports for the HLL functions of procedure 
Call & Return could be viewed as an architectural component. 

An examination of the execution of such architectural 
components on existing Von-Neumann style architectures 
reveals that the transfer operation is by far the most 
predominant operation. The transfers could be between (a) 
two memory locations, (b) a processor register and a memory 
location or (c) two processor registers. So it is quite 
reasonable to compare the implementation techniques in terms 
of the number of transfer operations required to implement an 
architectural component on a target processor. If we 
consider appropriate weights for the different kinds of 
transfer and the necessary statistics of program behaviour 
are known, the average cost of execution of a particular 
implementation of an architectural component (in terms of 
transfer weights) on a host processor could be obtained. 
This is essentially a technique for analysis of the suitability 
of host processor support for implementing some HLL 
functions. A somewhat similar study for two of the above 
mentioned implementation techniques (a & d) was done by 
DePrycker [Dep82b]. But the method proposed here should be 
considered essentially as a design aid rather than a 
technique for analysis (though it could very well be used 
for that purpose in subsequent phases of the design 
process). To start with, there is no need to assume any 
underlying hardware/firmware organization of the processor, 
except that a typical Von-Neumann style of processor 
design [Das83a] will be adopted. 
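
The weighted average cost mentioned above can be sketched in 
a few lines of Python. The transfer kinds ("mm", "rm", "rr"), 
the weights and the path probabilities below are all 
hypothetical values chosen for illustration; none of them is 
taken from the measurements discussed in this thesis. 

```python
def average_cost(paths, weights):
    """Expected cost of one architectural component.

    paths   : list of (probability, {transfer_kind: count}) pairs, one per
              execution path of the component
    weights : {transfer_kind: weight}
    """
    total = 0.0
    for probability, counts in paths:
        cost = sum(weights[kind] * n for kind, n in counts.items())
        total += probability * cost
    return total


# Assumed weights: memory-memory, register-memory, register-register.
WEIGHTS = {"mm": 4, "rm": 2, "rr": 1}

# Two assumed execution paths with their usage frequencies.
PATHS = [
    (0.75, {"rr": 4, "rm": 2}),
    (0.25, {"rr": 2, "mm": 1}),
]
```

With these assumed figures, average_cost(PATHS, WEIGHTS) 
combines a path of cost 8 taken 75% of the time with a path 
of cost 6 taken 25% of the time. 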

In the absence of any assumptions regarding the 
underlying hardware/firmware structure, it is necessary to 
use an abstract way of describing a particular technique for 
implementation of an architectural component. The computer 
design & description language (CDDL) S*A [Das81, Das82a, 
Das82b] is used for that purpose. From an S*A description 
of an architectural component, two measures could be derived 
that are termed the Virtual Transfer Complexity (VTC) and the 
average VTC (AVTC). As explained in Section 3.2, the AVTC 
measure is only a design aid in the sense that it indicates 
the minimum transfer complexity that could be theoretically 
achieved using a particular technique of implementation. 
One of the important characteristics of an S*A 'mechanism' 
is that the description of the mechanism remains independent 
of any change in the underlying support. The same is true 
for the AVTC measure, except that it is dependent on the 
usage statistics obtained from the programming language 
environment under consideration. 

In the next section, after presenting an overview of 
how the descriptions of the architectural components using 
the CDDL S*A facilitate the design process, the notion of 
Virtual Transfer Complexity will be developed and some 
illustrative evaluation of VTC for a few typical S*A 
statements will be included. 

Section 3.2 includes a brief review of some of the 
relevant notions of maintenance and variable access in the 
run time representation of a typical block structured HLL 
environment such as that of Ada. Complete S*A descriptions of 
the procedure Call-Return (CR) and Block Entry-Exit (BE) 
components for all the four implementation techniques are 
presented in this section. The section also includes the 
evaluation of AVTC's of the components for each technique. 

The remaining sections are devoted to deriving the 
necessary processor support from the S*A descriptions and 
evaluating the real transfer complexity (RTC) of the 
components on this organization. A comparison of the AVTC 
and ARTC values indicates that the choice of the 
implementation techniques could be restricted to only the 
techniques suggested by Rohl and Tanenbaum. Moreover, it is 
concluded that the choice between these two techniques 
depends on the relative frequency of procedures being 
declared at an intermediate (neither local nor global) 
lexical level with respect to the calling level. 


3.1 S*A Description and Virtual Transfer Complexity 
The S*A description facilitates the design process as 
follows: 

a. It is possible to describe the functioning of an 
architectural component at a level of abstraction that 
makes the description invariant with respect to the 
underlying hardware/firmware supporting the 
implementation of the component. 

b. It is possible to formally verify the correctness of a 
mechanism before any further refinement of the 
abstraction in the design process [Das83b]. 

c. The description clearly indicates the dependence of the 
architectural component on specific statistics of the 
programming language environment. This feature 
facilitates collection of appropriate statistics of 
usage. 

d. The description clearly indicates the possible 
trade-offs between M & R measures [DiS79] necessary for 
performance improvement. If all the important 
architectural components of a language directed 
architecture are considered together, the S*A 
descriptions of the components could yield a fairly 
complete set of hardware/firmware requirements for the 
processor. 

The Virtual Transfer Complexity (VTC) of the S*A 
description of an architectural component, implemented 
through a particular technique, is a measure of the 
cumulative count of the number of transfers necessary for the 
execution of the component. While evaluating the count, all 
the local variables (Privars in S*A) and the global 
variables (Glovars in S*A) are assumed to be available in 
the processor registers of a virtual processor. The array 
types are considered as register banks with the usual selection 
mechanism, and a transfer through the ALU is considered as a 
register-register transfer from ALU input to ALU output. 
Parallel execution of S*A statements is not considered at 
present; any such parallelism could be easily 
accounted for by considering only the statement having the 
maximum transfer count out of a few parallel statements. 

In the next section, VTC and AVTC of the procedure 
Call-Return component and the Block Entry-Exit component for 
all the four implementation techniques (mentioned earlier) 
would be evaluated. Before proceeding to the next section, 
the method of evaluation of VTC of some typical S*A 
statements is explained. The following type declarations 
are assumed: 

type M = seq [..] bit ; 
type A, B = array [..] of M ; 
privar P, Q, X : M ; 

a. X := P + Q ; 

(An abstract Arithmetic unit is assumed with two inputs 
ALU-1 & ALU-2 and the output ALU-0). 
The transfers are: from P to ALU-1; from Q to ALU-2; a 
transfer through ALU and a transfer from ALU-0 to X. 
Thus the VTC of this statement is 4. 


[Note: It is reasonable to assume the presence of an ALU 
with 2 inputs and 1 output capable of performing 
standard arithmetic and logical operations used in the 
S*A descriptions of the components. In evaluating the 
VTC for 'increment/decrement by 1' operations, we 
distinguish it from standard Add & Subtract operations 
in the sense that only one operand is considered to be 
transferred to the input.] 

b. P := A[X] ; 

The transfers are: transfer of X to the selection 
mechanism of the array A; transfer of the selected array 
component to P. The VTC of the statement is 2. 

c. A[X] := B[P + A[Q]] ; 

The transfers are: transfer of Q to selection mechanism 
of A; transfer of the selected component of A to ALU-1; 
transfer of P to ALU-2; transfer through the ALU; ALU-0 
to selection mechanism of B; transfer of X to selection 
mechanism of A; transfer of selected component in B to 
the selected component in A. The VTC of the statement 
is 7. 

d. WHILE P ¬= Q DO ... OD ; 

If the cumulative VTC of the statements between the 
delimiters DO and OD is w and if the DO-loop is executed 
m times, the VTC of the above compound statement is 
evaluated to be mw+3(m+1), since the test costs 3 transfers 
and is evaluated m+1 times. The transfer involved in the 
branching operation is ignored. 
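
The counting rules above can be checked mechanically. The 
following Python sketch simply lists the transfers named in 
the text for the first three examples and applies the 
mw+3(m+1) rule for the loop. The dictionary keys are a 
reading of the example statements inferred from the listed 
transfers, and the helper names vtc and vtc_while are 
hypothetical. 

```python
# Transfers for each example statement, in the order given in the text.
TRANSFERS = {
    "X := P + Q": [
        "P -> ALU-1", "Q -> ALU-2", "through ALU", "ALU-0 -> X",
    ],
    "P := A[X]": [
        "X -> select(A)", "A[X] -> P",
    ],
    "A[X] := B[P + A[Q]]": [
        "Q -> select(A)", "A[Q] -> ALU-1", "P -> ALU-2", "through ALU",
        "ALU-0 -> select(B)", "X -> select(A)", "B[..] -> A[X]",
    ],
}


def vtc(statement):
    """VTC of a simple statement = number of transfers it needs."""
    return len(TRANSFERS[statement])


def vtc_while(m, w, test_cost=3):
    """VTC of a WHILE loop whose body (cost w) runs m times.

    The test is evaluated m + 1 times; the branch itself is ignored.
    """
    return m * w + test_cost * (m + 1)
```

A loop that never executes its body still costs one test, 
which is why vtc_while(0, w) equals 3 for any w. 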


3.2 Variable Addressing Techniques 

A program in a block-structured language may be 
depicted as a tree to represent the nesting of the blocks 
[Dor79]. A block is delimited by begin and end. Thus each 
block may be associated with a level in the tree, which is 
called the static lexical level. The execution of a program 
may be viewed as dynamic changing of lexical levels. The 
lexical level of execution is called the dynamic lexical 
level. 

From a static point of view (i.e., at compile time) a 
block and a procedure (delimited by Procedure and end) are 
treated identically. The body of the procedure (delimited 
by begin-end pair) is treated naturally as a block and is 
associated with a lexical level one higher than the level of 
declaration of the corresponding procedure identifier. 


Within a block a variable is associated with a (lexical 
level, sequence number) pair. The lexical level component 
of the pair corresponds to the static lexical level of the 
block where the variable is declared; the sequence number 
indicates the sequence of declarations in the block. 

The dynamic change in lexical level occurs due to a 
block invocation or a procedure call. The nature of change 
in each case could be quite different. The new dynamic 
level due to a block invocation corresponds to the static 
level of the invoked block in the static program structure. 
A procedure invocation changes the dynamic level to 
correspond to a level one higher than the static level of 
declaration of the procedure identifier. The exit out of a 
block or return from a procedure requires the dynamic level 
to be reset to the level of invoking the block or calling of 
the procedure. 

In a block-structured language, a variable may be 
accessed if it is declared in the same block or in a 
statically surrounding block, i.e., when the variable is 
declared in one of the levels corresponding to the nodes in 
the path from the root to the node at the level of access in 
the tree. So the set of nodes on the path from the root to 
the active node could be considered to characterize the 
lexical levels involved in the dynamic variable accessing 
environment. In case of procedure calls, the procedure 
identifier is treated as a variable associated with a 
(lexical level, sequence number) pair and so the rules for 
procedure invocability are the same as those for accessibility 
of a variable. 

In the Ada environment, blocks could be treated as 
degenerate procedures called from the level where they are 
defined. The above discussion indicates that two linked 
list structures (or stacks) are needed to appropriately 
represent the static and dynamic structures of the 
environment. Usually the evaluation/allocation stack is 
combined with these two structures to form an activation 
stack. The stack frames corresponding to the static program 
structure are linked through a static link and the history 
of dynamic program activity is maintained through a dynamic 
link in each stack frame. It might be noted that the 
dynamic and static links in a frame allocated due to a block 
invocation contain the same values and merely point to the 
previous frame. In a frame allocated to the body of a 
procedure, the dynamic link points to the previous frame but 
the static link points to the frame where the procedure 
identifier was allocated. The dynamic link information is 
used in reverting back to the dynamic calling or 
block-invocation level after a return from a procedure or 
exit from a block. The static link information is utilised 
in accessing non-local variables. 
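
The roles of the two links can be made concrete with a short 
Python sketch. It models frames as objects rather than as 
words of a stack, and the names Frame and access are 
hypothetical; it is meant only to show how a (lexical level, 
sequence number) pair is resolved through the static chain. 

```python
class Frame:
    """One activation record; field names are illustrative only."""

    def __init__(self, level, static_link, dynamic_link):
        self.level = level                # static lexical level of the block
        self.static_link = static_link    # frame of the enclosing declaration
        self.dynamic_link = dynamic_link  # frame of the caller / invoker
        self.slots = {}                   # sequence number -> value


def access(current, level, seq):
    """Follow static links from `current` to the frame at `level`.

    Returns (value, hops); hops is the number of indirections needed.
    """
    frame, hops = current, 0
    while frame.level != level:
        frame = frame.static_link
        hops += 1
    return frame.slots[seq], hops


# main (level 1) declares P; P's body (level 2) declares Q (body level 3).
main = Frame(1, None, None)
main.slots[0] = "g"
p = Frame(2, static_link=main, dynamic_link=main)
p.slots[0] = "p-local"
q = Frame(3, static_link=p, dynamic_link=p)
# Q calls P recursively: the dynamic link of the new frame is Q's frame,
# but its static link is main's frame, where identifier P was declared.
p2 = Frame(2, static_link=main, dynamic_link=q)
```

In this example a reference from Q to a variable of main 
needs two indirections, which is the overhead the display 
techniques discussed next try to remove. 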

With the above description in mind, accessing a 
variable in the parent static environment requires following 
the static link chain to the appropriate level of 
declaration. Thus depending on the lexical level difference 
of access and declaration, a variable access could involve 
quite a few levels of indirection. Dijkstra proposed the 
technique of using an extra set of display locations (as a 
stack) to reduce this overhead [Dij60, Bac79, Hor83]. In a 
variable accessing mechanism implemented using Dijkstra's 
proposal, any accessible lexical level could be directly 
reached through the corresponding display location 
associated with that level. The top of the display stack 
points to the presently active frame and the remaining 
display contains the copy of the static links of the 
accessing environment. The sequences involved in procedure 
Call-Return and Block Entry-Exit are given in the S*A 
description of the mechanism M1. For ease of understanding 
the description, Fig 3.1(a) depicts the base of a stack frame 
used by this technique. A major problem with this classical 
technique is in rebuilding the display bank on return from a 
procedure; the number of locations to be reset is directly 
proportional to the lexical level difference between the 
levels of calling and declaration of the procedure. 
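
The rebuilding cost can be illustrated by the following 
Python sketch. It is a simplification and not a transcription 
of the S*A mechanism: frames are objects, the loop stops on a 
level comparison instead of a comparison with the saved static 
link, and the names Frame and rebuild_display are 
hypothetical. 

```python
class Frame:
    def __init__(self, name, level, static_link):
        self.name = name
        self.level = level
        self.static_link = static_link


def rebuild_display(display, caller, decl_level):
    """Reset display[decl_level+1 .. caller.level] from the static chain.

    Entries at or below decl_level are shared by caller and callee and
    are still valid, so the walk stops there. Returns the reset count.
    """
    resets = 0
    frame = caller
    while frame.level > decl_level:
        display[frame.level] = frame
        frame = frame.static_link
        resets += 1
    return resets


main = Frame("main", 1, None)
p = Frame("P", 2, main)
q = Frame("Q", 3, p)
# Q (level 3) called R, declared in main (level 1). R's body ran at level 2
# and a block inside it at level 3, so both display entries were overwritten.
r = Frame("R", 2, main)
display = [None, main, r, Frame("R-block", 3, r)]
resets = rebuild_display(display, caller=q, decl_level=1)
```

Here the caller is at level 3 and the procedure identifier at 
level 1, so two display locations have to be reset on return. 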

A modification to the above technique was suggested by 
Rohl [Roh75]. He observed that there was duplication of 
information among the static links, dynamic links and the 
display. The sequences involved in procedure call-return 
and block entry-exit for the suggested modification are given 
in the S*A description of the mechanism M2. In this scheme 
the static and dynamic links are replaced by a single link 
that links only the frames that are associated with the same 
lexical level. This implies that whenever the content of a 
display location is overwritten (for a call or entry) the 
corresponding information is stored in the stack frame 
associated with that dynamic lexical level. So it could be 
observed that the display rebuilding overhead after a return 
from a procedure is independent of the lexical level 
difference between the calling level and the level of 
declaration of the procedure. The rebuilding involves only 
the resetting of a single display location associated with 
the static lexical level of the procedure body. Figure 
3.1(b) depicts the base of a stack frame when this technique 
is used. 
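
A minimal Python sketch of this idea follows. It keeps the 
display as a list and frames as dictionaries, which is an 
assumption made for readability; the class name RohlDisplay 
and its fields are hypothetical and do not correspond to the 
registers of the S*A description. 

```python
class RohlDisplay:
    """Display with one saved entry per frame (simplified model)."""

    def __init__(self, max_level=8):
        self.display = [None] * (max_level + 1)  # one slot per lexical level
        self.stack = []                          # activation frames
        self.level = 0                           # current dynamic level
        self.resets = 0                          # display slots reset so far

    def call(self, name, decl_level):
        # The procedure body runs one level above its declaration.
        body_level = decl_level + 1
        frame = {
            "name": name,
            "level": body_level,
            # Save the entry that is about to be overwritten.
            "saved_display": self.display[body_level],
            "caller_level": self.level,
        }
        self.stack.append(frame)
        self.display[body_level] = frame
        self.level = body_level

    def ret(self):
        frame = self.stack.pop()
        # Only the single overwritten display slot is restored.
        self.display[frame["level"]] = frame["saved_display"]
        self.resets += 1
        self.level = frame["caller_level"]
```

For instance, if Q at level 3 calls a procedure declared at 
level 1, the return restores one display slot only, whatever 
the distance between the two levels. 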

Another variation of a technique that implements 
variable addressing through display locations is generally 
known as the local display method [BiB80, Dom80]. The 
procedures for procedure call-return and block entry-exit 
using this technique are described in the S*A mechanism M3. 
In this method the display locations necessary to represent 
the current variable accessing environment are stored in the 
current frame of the activation stack. The current frame is 
deallocated on return from a procedure or exit from a block. 
Thus there is practically no extra overhead for rebuilding 
the display on return from a procedure. 

A significant variation from the above techniques was 
suggested by Tanenbaum [Tan78]. His observation was that 
most of the variable accesses are local or global in nature 
(not exactly true for an Algol-like language) and thus 
dedicated display locations for non-local references were 
not necessary. He proposed the use of two dedicated 
locations or registers for pointing to the bases of the 
current frame and global frame of the activation stack. The 
procedures for procedure call-return and block entry-exit 
are described in the S*A mechanism M4. In a previous 
analysis, DePrycker [Dep82a] ignored the overhead involved 
in accessing the procedure identifier during a procedure 
Call, when the procedure identifier is neither a local nor a 
global variable. In Tanenbaum's description this overhead 
is included in the precall sequence (Mark instruction) 
[Tan78]. In this technique, any variable that is neither 
local nor global is accessed using the chain of static 
links. 
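
The access rule of this technique can be sketched in Python 
as follows. The two arguments lp and gp stand for the 
dedicated local and global base pointers; the Frame class and 
the function name access are hypothetical and serve only to 
illustrate the rule. 

```python
class Frame:
    def __init__(self, level, static_link):
        self.level = level
        self.static_link = static_link
        self.slots = {}  # offset -> value


def access(lp, gp, level, offset):
    """Resolve (level, offset) with two dedicated base pointers.

    Local and global references need no indirection; anything in
    between walks the static chain starting from the local frame.
    Returns (value, hops).
    """
    if level == lp.level:
        return lp.slots[offset], 0
    if level == gp.level:
        return gp.slots[offset], 0
    frame, hops = lp, 0
    while frame.level != level:
        frame = frame.static_link
        hops += 1
    return frame.slots[offset], hops


main = Frame(1, None)
main.slots[0] = "global"
p = Frame(2, main)
p.slots[0] = "intermediate"
q = Frame(3, p)
q.slots[0] = "local"
```

Only the intermediate reference pays for a static chain 
traversal, which is why the relative frequency of such 
references decides how well the technique performs. 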


Note: [In the S*A mechanisms to follow, lexical level of the 
root node of the program tree is considered to be 1. 
Moreover, a variable is associated with a (lexical level, 
offset) pair, where offset is with respect to the base of 
allocation/evaluation area in a stack frame]. 

[Figure 3.1 : A View of the Current Stack Frame Just 
After Execution of the Procedures (a) M1.CALL 
(b) M2.CALL (c) M3.CALL (d) M4.CALL. Labels recoverable 
from the panels: Frame Mark, Prev., DR (Dis top), 
Local display, Alloc./Eval. Area.] 

Mech M1 ; /* DIJKSTRA's DISPLAY MECHANISM */ 
type memword = seq [..] bit; 
glovar mainmem: array [..] of memword; 
syn Stack = mainmem; 
glovar Sptr, Frame-mark, Pctr : memword; 
glovar Display : array [0..2^(b+1) - 1] of memword; 
glovar Inst-reg : tuple 
                    adr : tuple 
                            Lex : seq [..] bit; 
                            Offset : seq [..] bit; 
                          endtup 
                  endtup; 
: memword; 
privar Local, Stat, Ptr : memword; 
privar Dis-top : seq [b..0] bit 


Proc CALL; 
  Stack [Sptr] := Frame-mark ; (2) 
  Frame-mark := Sptr ; (1) Sptr := Sptr + 1 ; (3) 
  Stack [Sptr] := Dis-top ; (2) Sptr := Sptr + 1 ; (3) 
  /* stored the 'calling' lexical level number */ 
  Stack [Sptr] := Pctr ; (2) Sptr := Sptr + 1 ; (3) 
  /* Stored the return address */ 
  Stack [Sptr] := Display [Dis-top] ; (3) 
  Sptr := Sptr + 1 ; (3) 
  /* Stored the dynamic link */ 
  Dis-top := Inst-reg.adr.Lex ; (1) 
  /* Dis-top gets the lexical level of the block where 
     the procedure identifier was declared */ 
  Stack [Sptr] := Display [Dis-top] ; (3) 
  Sptr := Sptr + 1 ; (3) 
  /* stored the static link */ 
  Pctr := Stack [Display [Dis-top] + Inst-reg.adr.Offset] ; 
  /* Address of the code segment of the called 
     routine is loaded in Pctr */ 
  Dis-top := Dis-top + 1 ; (3) 
  /* Dis-top contains lexical level of the declaration 
     of the procedure body */ 
  Display [Dis-top] := Sptr ; (2) 
endProc; 


Proc RETURN ; 
  Sptr := Frame-mark ; (1) 
  Frame-mark := Stack [Sptr] ; (2) 
  Local := Display [Dis-top] ; (2) 
  Dis-top := Stack [Local - 4] ; (5) 
  /* Calling lex level in Dis-top */ 
  Pctr := Stack [Local - 3] ; (5) 
  /* Return address in Pctr */ 
  Display [Dis-top] := Stack [Local - 2] ; (6) 
  /* Display [Dis-top] points to calling environment */ 
  Stat := Stack [Local - 1] ; (4) 
  Ptr := Stack [Display [Dis-top] - 1] ; (5) 
  /* Stat and Ptr are used to facilitate rebuilding 
     of Display */ 
  Local := Dis-top ; (1) Local := Local - 1 ; (3) 
  /* Dis-top is kept undisturbed */ 
  WHILE Display [Local] ¬= Stat (3[n+1]) 
  DO Display [Local] := Ptr ; (2) 
     Ptr := Stack [Ptr - 1] ; (4) 
     /* Ptr is used in traversing the static chain */ 
     Local := Local - 1 ; (3) 
  OD ; /* (n+1) = lex level diff. between 
          the calling level and level where 
          procedure identifier was declared */ 
endProc; 
Proc ENTRY; 
  Stack [Sptr] := Frame-mark ; (2) 
  Frame-mark := Sptr ; (1) Sptr := Sptr + 1 ; (3) 
  Stack [Sptr] := Display [Dis-top] ; (3) 
  Sptr := Sptr + 1 ; (3) 
  /* Stored dynamic link */ 
  Stack [Sptr] := Display [Dis-top] ; (3) 
  Sptr := Sptr + 1 ; 
  /* stored static link; though it is unnecessary. */ 
  /* for the sake of uniformity */ 
  Dis-top := Dis-top + 1 ; (3) 
  Display [Dis-top] := Sptr ; (2) 
endProc; 

Proc EXIT ; 
  Sptr := Frame-mark ; (1) 
  Frame-mark := Stack [Sptr] ; (2) 
  Dis-top := Dis-top - 1 ; (3) 
endProc 

endMech M1; 

Mech M2 ; /* ROHL's METHOD */ 

type memword = seq [..] bit; 
glovar mainmem : array [..] of memword; 
syn Stack = mainmem; 
glovar Sptr, Frame-mark, Pctr : memword; 
glovar Display : array [0..2^(b+1) - 1] of memword; 
glovar Inst-reg : tuple 
                    adr : tuple 
                            Lex : seq [..] bit; 
                            Offset : seq [..] bit; 
                          endtup 
                  endtup; 
: memword; 
privar Dis-top : seq [b..0] bit 
privar Local : memword 


Proc CALL; 
  Stack [Sptr] := Frame-mark ; (2) 
  Frame-mark := Sptr ; (1) Sptr := Sptr + 1 ; (3) 
  Stack [Sptr] := Dis-top ; (2) Sptr := Sptr + 1 ; (3) 
  /* lex level number of calling level stored */ 
  Stack [Sptr] := Pctr ; (2) Sptr := Sptr + 1 ; (3) 
  /* ret. addr. stored */ 
  Dis-top := Inst-reg.adr.Lex ; (1) 
  Pctr := Stack [Display [Dis-top] 
                 + Inst-reg.adr.Offset] ; (6) 
  /* Pctr is loaded with addr. of code segment of the 
     called procedure */ 
  Sptr := Sptr + 1 ; (3) 
  Dis-top := Dis-top + 1 ; (3) /* Dis-top points to the 
     head of present display; Lex level of the block for 
     procedure body is 1 more than the block where Proc. 
     identifier was declared */ 
  Stack [Sptr] := Display [Dis-top] ; (3) 
  Sptr := Sptr + 1 ; (3) 
  /* Content of Display location is stored as it will be 
     overwritten with a pointer to new level */ 
  Display [Dis-top] := Sptr ; (2) 
endProc; 


Proc RETURN ; 
  Sptr := Frame-mark ; (1) Frame-mark := Stack [Sptr] ; (2) 
  /* Deallocation of frame */ 
  Local := Display [Dis-top] ; (2) 
  Display [Dis-top] := Stack [Local - 1] ; (5) 
  /* reinstated the content that was destroyed due 
     to call */ 
  Dis-top := Stack [Local - 4] ; (5) 
  Pctr := Stack [Local - 3] ; (5) 
EndProc ; 


Proc ENTRY ; 
  Stack [Sptr] := Frame-mark ; (2) 
  Frame-mark := Sptr ; (1) Sptr := Sptr + 1 ; (3) 
  Dis-top := Dis-top + 1 ; (3) 
  Stack [Sptr] := Display [Dis-top] ; Sptr := Sptr + 1 ; 
  /* As in call */ 
  Display [Dis-top] := Sptr ; (2) 
  /* Display [Dis-top] points to present environment */ 
endProc ; 


Proc EXIT ; 
  Sptr := Frame-mark ; (1) 
  Frame-mark := Stack [Sptr] ; (2) 
  Display [Dis-top] := 
      Stack [Display [Dis-top] - 1] ; (6) 
  /* restored the content overwritten due to 
     Block entry */ 
  Dis-top := Dis-top - 1 ; (3) 
EndProc ; 

End Mech M2 ; 

Mech M3 ; /* LOCAL DISPLAY */ 

type memword = seq [..] bit; 
glovar mainmem: array [..] of memword; 
syn Stack = mainmem; 
glovar Sptr, Frame-mark, Pctr : memword; 
glovar Inst-reg : tuple 
                    adr : tuple 
                            Lex : seq [..] bit; 
                            Offset : seq [..] bit; 
                          endtup 
                  endtup; 
: memword; 
glovar Base : memword; 
privar Ptr : memword ; 

Proc CALL; 
  Stack [Sptr] := Frame-mark ; (2) 
  Frame-mark := Sptr ; Sptr := Sptr + 1 ; (3) 
  Stack [Sptr] := Pctr ; (2) Sptr := Sptr + 1 ; (3) 
  /* return address stored */ 
  Stack [Sptr] := Base ; (2) Sptr := Sptr + 1 ; (3) 
  /* dynamic link stored */ 
  Ptr := Base ; (1) /* copy of dyn. link. in Ptr */ 
  Base := Sptr ; (1) /* new base assigned */ 
  /* The following loop builds up the necessary local 
     display in the present frame */ 
  WHILE Inst-reg.adr.Lex ¬= 0 (3[l + 1]) 
  DO 
    Stack [Sptr] := Stack [Ptr] ; (3) 
    Sptr := Sptr + 1 ; (3) Ptr := Ptr + 1 ; (3) 
    Inst-reg.adr.Lex := Inst-reg.adr.Lex - 1 ; (3) 
  OD 
  /* l = Lex level of called procedure identifier */ 
  Stack [Sptr] := Sptr + 1 ; (4) 
  /* topmost display stored */ 
  Sptr := Sptr + 1 ; (3) 
  Pctr := Stack [Stack [Base + Lex] 
                 + Inst-reg.adr.Offset] ; (7) 
  /* Pctr is loaded with addr. of the code segment of 
     the called pgm. */ 
EndProc; 


Proc RETURN ;
Sptr := Frame-mark ; (1)
Frame-mark := Stack [Sptr] ; (2)
Pctr := Stack [Base - 2] ; (5) /* ret. adr. loaded */
Base := Stack [Base - 1] ; (4)
/* Base points to base of Calling Frame */
EndProc ;


Proc ENTRY ;
Stack [Sptr] := Frame-mark ; (2)
Frame-mark := Sptr ;
Sptr := Sptr + 1 ; (3)
/* new frame allocated */
Stack [Sptr] := Base ; (2) Sptr := Sptr + 1 ; (3)
Base := Sptr ; (1) /* new base */
WHILE Stack [Ptr] - 1 ≠ Ptr (3 [γ + 1])
DO
   Stack [Sptr] := Stack [Ptr] ; (6)
   Sptr := Sptr + 1 ; (3)
   Ptr := Ptr + 1 ; (3)
OD ;
Stack [Sptr] := Sptr + 1 ; (4) /* as in Proc. CALL */
Sptr := Sptr + 1 ; (3)
EndProc ;


Proc EXIT ;
Sptr := Frame-mark ; (1)
Frame-mark := Stack [Sptr] ; (2)
/* De-allocation of frame */
Base := Stack [Base - 1] ; (4)
endProc ;


end Mech M3 ;

Mech M4 ; /* TANENBAUM'S METHOD */
type memword = seq [..] bit ;
glovar mainmem : array [..] of memword ;
syn stack = mainmem ;
glovar Sptr, Frame-mark, Pctr : memword ;
glovar LP, GP : memword ;
glovar Inst-reg : tuple
                    adr : tuple
                            Dlex : seq [..] bit ;
                            Offset : seq [..] bit ;
                          endtup
                  endtup ;
                  : memword ;


privar chain, Ptr : memword ;


Proc CALL ;

Stack [Sptr] := Frame-mark ; (2)
Frame-mark := Sptr ; (1) Sptr := Sptr + 1 ; (3)
Stack [Sptr] := Pctr ; (2) Sptr := Sptr + 1 ; (3)
Stack [Sptr] := LP ; (2) Sptr := Sptr + 1 ; (3)
/* Dynamic link stored */
IF Inst-reg.Adr.Dlex = "Global" (3) =>
      Stack [Sptr] := GP ; (2) Ptr := GP ; (1)
|| Inst-reg.Adr.Dlex = 0 (3) =>
      Stack [Sptr] := LP ; (2) Ptr := LP ; (1)
|| DO chain := Inst-reg.Adr.Dlex ; (1)
      WHILE Chain ≠ 0 (3 [δ_i + 1])
      DO Ptr := Stack [Ptr - 1] ; (4)
         Chain := Chain - 1 ; (3)
      OD
      Stack [Sptr] := Ptr ; (2) Sptr := Sptr + 1 ; (3)
      /* Static link set */
   OD /* δ_i = lexical level difference between
         the level of call and the level of declaration
         of the procedure identifier, when the procedure
         identifier is an intermediate variable. */

LP := Sptr ; (1) /* New LP set */
Pctr := Stack [Ptr + Inst-reg.adr.offset] ; (5)

EndProc ;


Proc RETURN ;
Sptr := Frame-mark ; (1)
Frame-mark := Stack [Sptr] ; (2)
Pctr := Stack [LP - 3] ; (5) /* ret. address */
LP := Stack [LP - 2] ; (5) /* LP points to
Calling environment */
endProc ;
Proc ENTRY ;

Stack [Sptr] := Frame-mark ; (2)
Frame-mark := Sptr ; (1) Sptr := Sptr + 1 ; (3)
Stack [Sptr] := LP ; (2) Sptr := Sptr + 1 ; (3)
Stack [Sptr] := LP ; (2) Sptr := Sptr + 1 ; (3)
/* both links are stored, for uniformity in var
access mech. */
endProc ;
Proc EXIT ;
Sptr := Frame-mark ; (1)
Frame-mark := Stack [Sptr] ; (2)
LP := Stack [LP - 1] ; (4)

endProc


end Mech M4 ;

The virtual transfer complexity of the CR and BE components
of the above-mentioned mechanisms is now evaluated.

VTC of M1.CR (Call-Return component as described in
Mech M1)

$= 64 + 12\delta$

where $\delta$ represents the difference in lexical level
between the point of calling and the point of
declaration of the Procedure identifier.

The AVTC of M1.CR $= 64 + 12\hat{\delta}$, where $\hat{\delta}$ is the
average $\delta$ computed from a large set of sample
programs characterizing the programming language
environment under consideration.

VTC of M1.BE (Block Entry-Exit component as
described in Mech M1)

= 23 + 6 = 29 = AVTC of M1.BE

VTC of Mz.CR (Call-Return component as described in 
Mech M2) 


= 37 + 20 = 57 = AVTC of M2.CR 


VTC of M2.BE (Block Entry-Exit component as
described in Mech M2)

where $\beta$ is the lexical level of the procedure
identifier.

AVTC of M3.CR $= 50 + 15\hat{\beta}$

where $\hat{\beta}$ is the average $\beta$ computed over a large
number of sample programs.


VTC of M3.BE

$= 33 + 15\gamma$ [where $\gamma$ =
number of displays to be transferred from the previous
frame] and AVTC of M3.BE $= 33 + 15\hat{\gamma}$.

VTC of M4.CR =
$(22 + 6) + 13 = 41$ [if the procedure identifier is a
global variable] or
$(22 + 9) + 13 = 44$ [if the procedure identifier is a
local variable at calling level] or
$(22 + (29 + 10\delta_i)) + 13 = 64 + 10\delta_i$
[otherwise].

AVTC of M4.CR =
$35 + n_g (6) + n_l (9) + n_i (29 + 10\hat{\delta}_i)$,
where $n_g$ = relative frequency of a procedure
identifier being a global variable,
$n_l$ = relative frequency of a procedure identifier being
a local variable,
$n_i$ = relative frequency of a procedure
identifier being an intermediate variable,
$\hat{\delta}_i$ = mean difference in lexical level between the
level of calling and declaration for a procedure
identifier that is declared at an intermediate
level. $\hat{\delta}_i$ is approximately considered equal to $\hat{\delta}$.

The relative frequencies are assumed to be obtained
by considering all the procedure calls (static) over
a large number of sample programs characterizing the
programming language environment.

VTC of M4.BE = 24 = AVTC of M4.BE


3.2.1 Comparison of the AVTCs
For the purpose of comparison, the following
preliminary statistics obtained by DePrycker [Dep82b] for
Algol-60 are used: $\hat{\delta} = 2$, $\hat{\beta} = 1$ and $\hat{\gamma} \approx 2$.

Using these values in the comparison of M1.CR, M2.CR
and M3.CR, it could be observed that

AVTC (M1.CR) = 88 ; AVTC (M2.CR) = 57 ; AVTC (M3.CR) = 65.

As far as the CR components of these three mechanisms are
concerned, M2.CR is expected to perform best. Similarly,
comparing the AVTC's of the BE components of these three
mechanisms, it is observed that
AVTC (M1.BE) = 29 ; AVTC (M2.BE) = 24 and AVTC (M3.BE) = 63.
Again the conclusion is that M2.BE is expected to
perform best. It is difficult to make an absolute
comparison between M4.CR and the CR components of the other
three mechanisms, as the relative frequencies $n_g$, $n_l$ and
$n_i$ of global, local and intermediate procedure identifiers
are not available in the literature.
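
The AVTC figures quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the linear forms 64 + 12δ for M1.CR, 50 + 15β for M3.CR and 33 + 15γ for M3.BE; the function names and the dictionaries are illustrative and not part of the methodology.

```python
# Preliminary Algol-60 statistics from [Dep82b] as quoted in the text.
DELTA_HAT = 2   # mean lexical-level difference between call and declaration
BETA_HAT = 1    # mean lexical level of the procedure identifier
GAMMA_HAT = 2   # mean number of displays copied on block entry


def avtc_m1_cr(delta: float) -> float:
    return 64 + 12 * delta


def avtc_m3_cr(beta: float) -> float:
    return 50 + 15 * beta


def avtc_m3_be(gamma: float) -> float:
    return 33 + 15 * gamma


# M2 has constant costs, so its VTC and AVTC coincide.
avtc_cr = {"M1": avtc_m1_cr(DELTA_HAT), "M2": 57, "M3": avtc_m3_cr(BETA_HAT)}
avtc_be = {"M1": 29, "M2": 24, "M3": avtc_m3_be(GAMMA_HAT)}

best_cr = min(avtc_cr, key=avtc_cr.get)
best_be = min(avtc_be, key=avtc_be.get)
```

With these inputs both `best_cr` and `best_be` select M2, in agreement with the comparison above.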
Still it is possible to make an interesting observation
when AVTC (M2.CR) and AVTC (M4.CR) are compared. The CR and
BE components of M4 are not compared with those of M1 and M3,
as it has already been observed that M2 is definitely
superior in all respects. The AVTC measures of (M2.BE) and
(M4.BE) are equal.

Let $\hat{\delta}_i = \hat{\delta} = 2$;
then AVTC (M4.CR) $= 35 + 6 n_g + 9 n_l + n_i (29 + 20)$

$= 35 + 6 n_g + 9 n_l + 49 n_i$.
We know AVTC (M2.CR) = 57 and $n_g + n_l + n_i = 1$.

From the above three relations, it is possible to
conclude that if the relative frequency of declaring a
procedure at an intermediate level is greater than 0.4 then
AVTC (M4.CR + M4.BE) > AVTC (M2.CR + M2.BE)
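
The threshold on the intermediate-level frequency can be checked numerically. The sketch below assumes that the three relative frequencies sum to one and that the mean level difference is 2; the symbol names `n_g`, `n_l`, `n_i` and the helper functions are illustrative assumptions. Since the BE costs of the two mechanisms are equal, only the CR costs need to be compared.

```python
AVTC_M2_CR = 57


def avtc_m4_cr(n_g: float, n_l: float, n_i: float, delta_i: float = 2) -> float:
    """AVTC of the Call-Return component of M4 for given identifier frequencies."""
    return 35 + 6 * n_g + 9 * n_l + n_i * (29 + 10 * delta_i)


def min_avtc_m4_cr(n_i: float, delta_i: float = 2) -> float:
    """Most favourable case for M4: every non-intermediate identifier is global."""
    return avtc_m4_cr(1.0 - n_i, 0.0, n_i, delta_i)


def break_even_n_i(delta_i: float = 2) -> float:
    """Frequency above which M4.CR costs more than M2.CR even in its best case."""
    # min_avtc_m4_cr(n_i) = 41 + n_i * (23 + 10 * delta_i)
    return (AVTC_M2_CR - 41) / (23 + 10 * delta_i)
```

Under these assumptions the break-even point is 16/43, roughly 0.37, so any intermediate-level frequency above 0.4 makes M4.CR more expensive than M2.CR regardless of how the remaining identifiers split between global and local.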

For choosing the implementation technique for the
variable address mechanism in a block-structured 
environment, it is necessary to include the VTC of variable 
access. From the description of the four mechanisms, it is 
obvious that the VTC of accessing a variable is same for M, 
and M, and it is less than the VTC of a variable access in 
M3. The discussion on the average cost of using any one of
these implementation techniques is postponed until a later
section, where the cost of variable access is also included
in the evaluation. In that section some interesting
observations regarding the choice among the mechanisms are
restricted to the mechanisms M2 and M4, as the AVTC's (as
well as the ARTC's, in the next section) of the CR and BE
components of M1 and M3 have been found to be larger.


3.3 The Real Transfer Complexity 

Given the AVTC's, the designer knows the best that 
could be derived out of the implementation techniques. Now 
the designer is faced with the problem of choosing a 
technique for implementation on a real architecture keeping 
the various design constraints and economic considerations 
in mind. The choice based on AVTC's might not be the same
on a real hardware/firmware combination unless an 'ideal'
range for mapping the domain of S*A constructs onto the
hardware/firmware combination is available. (The following
discussion assumes a microprogram controlled processor,
though the conclusions obtained would be valid for an
equivalent hardwired control scheme.) As indicated earlier,
the S*A description of a mechanism clearly suggests the
hardware/firmware support necessary for a most effective
implementation. Once the design constraints are clearly
laid out the architect can easily specify the feasible
hardware/firmware support of the processor. The development
of a semi-automated design support through a family of
description languages is in progress [Das82].

The real transfer complexity of an S*A statement is
expressed as a matrix of three components, namely the number
of register-register transfers ($R(S_i)$), the number of transfers
through the ALU ($A(S_i)$) and the number of transfers across
the processor-memory interface ($MP(S_i)$) required for the
execution of the statement $S_i$ in the S*A description.
So $\mathrm{RTC}(S_i) = [R(S_i), A(S_i), MP(S_i)]$, written with
the subscripts R, A and M under the respective entries; the
subscripts are used for no other reason but notational clarity.
The RTC of an exo-architectural component X in mech $M_i$ is
denoted as RTC ($M_i$.X) and is $\sum_j \mathrm{RTC}(S_j)$, where the summation
includes all the statements in the execution of the
component ($M_i$.X). The average RTC (ARTC) is calculated as
before.

In the S*A description of the architectural components, 
introduction of any possible parallelisms among various 
statements is intentionally avoided. Similarly it is
assumed that the microprogram control of the real processor 
is of purely vertical nature. The simplistic assumptions 
are maintained so that the primary issues in the methodology 
are clearly laid out. It might be noted that the language 
S*A has adequate constructs to express parallelism [Das82a] 
and once the mechanisms include parallelism, the underlying 
firmware control (to be designed) could include possible 
parallel operations. The evaluation of VTC and RTC requires 
quite simple modifications. 

Before evaluating the RTC's of the components in 
discussion, it is necessary to propose the organization of 
the hardware/firmware combination of the processor that 
would support these components. As indicated earlier, the 
requirements of the supports necessary to achieve a RTC
measure reasonably close to the VTC measure could be
directly obtained from the S*A descriptions.

Instead of describing four different processor
organizations for efficiently supporting the implementation
of the components of the four mechanisms, Fuller's Canonical
Processor is adopted as the basic Von-Neumann structure and
any special requirements for each mechanism are indicated.
The processor organization is shown in fig. 3.2. It is
assumed that any two of the registers (shown in the figure)
could participate in a register-register transfer through a
common bus.

The stack is assumed to be in the main memory. The
Glovar Display is mapped on to a bank of registers (DR's) and
the privar Dis-top is mapped on to a register Dis-top which
is used as a selector index for the bank of registers DR.
The glovars Sptr, Pctr and Frame-mark are mapped onto the
processor registers SP, PC (Program Counter) and FM;
Inst-reg is mapped onto IR (Instruction Register). The
privars LOCAL (in M1 and M2) as well as the Glovar LP (in
M4) are mapped onto the processor register L. The Privar
Ptr is mapped onto the register PT. The Privars stat (in
M1), Base (in M3) and the Glovar GP are mapped onto the
processor register G. The Privar Chain is mapped onto the
register T. At this point no distinction is made between
the registers that are user-addressable and the registers
that are only used by the microprograms. Such a
specification is quite straightforward if the
analysis/design is restricted to a specific mechanism. In
any case the microcode can address any one of the registers.
The temporary register T is to be used only by the
microcode. The register ACC serves as one of the inputs to
the ALU. ALU-2 and ALU-O are respectively an input and the
output registers of the ALU. For any monadic ALU operation,
the operand is expected in the ACC.

The choice of this particular processor structure is a
direct implementation of the S*A description on a typical
Von-Neumann structure with a purely vertical microprogram
control where obvious design constraints lead to the
implementation of the Glovar stack in main memory with a
constant address base (St-base) known to the firmware. It
could be noted that the processor structure is not biased
towards any particular mechanism and thus the comparison of
RTC's based on this processor structure could be considered
fair.

Examples of RTC evaluation for some typical S*A
statements involved in describing the mechanisms are
presented below.

1. Stack [Sptr] := Frame-mark ;

Frame-mark and Sptr are available in processor
registers. The stack is in main memory and the
stack-base is a constant address known to the firmware
or the underlying hardware control. Only the transfers
involved are indicated and the transfers are tagged with
R, M or A to indicate the nature of the transfer. The
transfers are:

Stack-base ---> ALU-2 ;(R)
SP ---> ACC ;(R) ADD ;(A) ALU-O ---> MAR ;(R)
Frame-mark ---> MDR ;(R) WRITE ;(M)
The RTC of the statement is [4 1 1]
                             R A M

2. Stack [Sptr] := Display [Dis-top] ;

As mentioned earlier, Display is a bank of registers in
the processor indexed by the register Dis-top. The
transfers are:
Display [Dis-top] ---> MDR ;(R)
Stack-base ---> ALU-2 ;(R) SP ---> ACC ;(R)
ADD ;(A) ALU-O ---> MAR ;(R) WRITE ;(M)
The RTC of the statement is [5 1 1]
                             R A M

3. Sptr := Sptr + 1 ;

The transfers are:
SP ---> ACC ;(R) INCREMENT ;(A) ALU-O ---> SP ;(R)
The RTC of the statement is [2 1 0]
                             R A M

4. Pctr := Stack [Display [Dis-top + Inst-reg.adr.offset]] ;

The Inst-reg.adr.offset field of Inst-reg is known to
the firmware and an appropriate mask is known to the
firmware for extracting the field. The transfers are:

IR ---> ACC ;(R) Mask (offset) ---> ALU-2 ;(R)
AND ;(A) ALU-O ---> ALU-2 ;(R) Dis-top ---> ACC ;(R)
ADD ;(A) Display [ALU-O] ---> ALU-2 ;(R)
Stack-base ---> ACC ;(R) ADD ;(A) ALU-O ---> MAR ;(R)
READ ;(M) MDR ---> PC ;(R)

(Note: The offset field was assumed to be a right-aligned
field; in cases of other fields, e.g. Inst-reg.adr.Lex,
the SHIFT operations required are counted as one
average ALU transfer.)

The RTC of the statement is [9 3 1]

5. Dis-top := Stack [Local - 4] ;

The variable Local is available in a processor register
'L' and the constant 4 is a constant coded in the
microinstruction. The transfers are:

L ---> ACC ;(R) 4 ---> ALU-2 ;(R) SUB ;(A)
ALU-O ---> ALU-2 ;(R) Stack-base ---> ACC ;(R) ADD ;(A)
ALU-O ---> MAR ;(R) READ ;(M) MDR ---> Dis-top ;(R)

The RTC of the statement is [6 2 1]

The RTC's for the other statements are evaluated in a
similar fashion.
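
The tallying of tagged transfers lends itself to a direct sketch. The representation below (a list of `(description, tag)` pairs and the names `rtc` and `rtc_sum`) is an illustrative assumption; the transfer sequences themselves are those listed above for the statements `Stack [Sptr] := Frame-mark`, `Sptr := Sptr + 1` and `Dis-top := Stack [Local - 4]`.

```python
from collections import Counter


def rtc(transfers):
    """Return the (R, A, M) counts for a list of (description, tag) pairs."""
    counts = Counter(tag for _, tag in transfers)
    return (counts["R"], counts["A"], counts["M"])


def rtc_sum(statements):
    """RTC of a component: element-wise sum of the RTC's of its statements."""
    totals = [rtc(s) for s in statements]
    return tuple(sum(column) for column in zip(*totals))


STORE_FRAME_MARK = [            # Stack [Sptr] := Frame-mark
    ("Stack-base -> ALU-2", "R"), ("SP -> ACC", "R"), ("ADD", "A"),
    ("ALU-O -> MAR", "R"), ("Frame-mark -> MDR", "R"), ("WRITE", "M"),
]
INCREMENT_SPTR = [              # Sptr := Sptr + 1
    ("SP -> ACC", "R"), ("INCREMENT", "A"), ("ALU-O -> SP", "R"),
]
LOAD_DIS_TOP = [                # Dis-top := Stack [Local - 4]
    ("L -> ACC", "R"), ("4 -> ALU-2", "R"), ("SUB", "A"),
    ("ALU-O -> ALU-2", "R"), ("Stack-base -> ACC", "R"), ("ADD", "A"),
    ("ALU-O -> MAR", "R"), ("READ", "M"), ("MDR -> Dis-top", "R"),
]
```

Summing the three tuples with `rtc_sum` illustrates how the RTC of a whole component is accumulated from its statements.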


The real transfer complexity (RTC) and the Average RTC
(ARTC) of components discussed earlier are as follows:

On comparing the ARTC's of the CR and BE components of
the three mechanisms M1, M2 and M3, it could be concluded,
as in the previous section, that in evaluating the average
transfer cost of a program it is sufficient to consider the
mechanisms M2 and M4.


3.4 Overall Performance 

As discussed earlier, to estimate the overall
performance of an implementation technique for an average
program, it is essential to estimate the AVTC and ARTC of
variable access as well. As far as the comparison of the
mechanisms M1, M2, M3 and M4 is concerned, the AVTC and ARTC
of variable access for M2 and M4 need evaluation.

The variable access (VARAC) component of a mechanism
should be composed of two separate procedures for Read and
Write access. As the ARTC/AVTC of both the sub-components
are identical, only the procedure for Read access (RVAR)
would be described for both the mechanisms. The declarative
parts of the mechanisms M2 and M4 are not repeated and one
more system Glovar declaration for the variable Read-Value
is assumed. In the description of the real processor
architecture, Read-value maps onto MDR.


The procedure RVAR from M2.VARAC is as follows:


procedure RVAR /* for mech M2 */
Read-Value := Stack [Display [Inst-reg.adr.Lex] +
                     Inst-reg.adr.offset] ; (6)

The ARTC (M2.VARAC)


The procedure RVAR for M4.VARAC is as follows:
Procedure RVAR /* for mech M4 */
IF Inst-reg.adr.Dlex = "Global" (3) =>
      Read-value := Stack [GP + offset] ; (5)
|| Inst-reg.adr.Dlex = 0 (3) =>
      Read-value := Stack [LP + offset] ; (5)
|| DO Chain := Inst-reg.adr.Dlex ; (1)
      Ptr := Stack [LP - 1] ; (4)
      WHILE Chain ≠ 0 (3 [ρ + 1])
      DO Ptr := Stack [Ptr - 1] ; (4)
         Chain := Chain - 1 ; (3)
      OD
      Read-value := Stack [Ptr
                           + Inst-reg.adr.offset] ; (5)
   OD /* ρ = lexical level difference between
         the level of declaration and access
         of the intermediate var. */
FI ;


endProc ;


The AVTC (M4.VARAC) $= 5 + \pi_1 \cdot 8 + \pi_2 \cdot 11 + \pi_3 [14 + 10\hat{\rho}]$

where $\pi_1$, $\pi_2$ and $\pi_3$ are relative frequencies of global,
local and intermediate variable access respectively. $\hat{\rho}$ = the
mean lexical level difference between the level of
declaration and level of access of an intermediate variable
computed over a large number of sample programs.


4. 24 ay 
10 3 1 
ARTC(M,.VARAC)=[1 71 n2 73 ] 5 A M 
13, 34, iF 
(2274010 ep (it Bd. G0 FF Le 


For comparing the two mechanisms, as far as variable access
is concerned, the preliminary statistics available in
[Dep82b] are used. The statistics are:


aN 


EALECOLSSpaRs =S0'.28: hy, Ge tOes ee eandes tes 01.0. 


On substituting, ARTC(M,.VARAC) = [22.3, 6.5, 1.7] 
& PAVTIC (MNEVARAC) @2m3.077 
DePrycker's Program activity characterization model
would be used to evaluate the average cost of using an
implementation technique [Dep82a]. The average VTC of a
block structured program using the technique proposed by
Rohl could be represented as

AVTC (P(M2)) $= n_b \cdot 24 + n_c \cdot 57 + n_v \cdot 6$

[The term AVTC (P(M)) stands for the average virtual transfer
complexity of an average program activity using mech M.]

where $n_b$ = mean number of Block Entry-Exit in a program,
$n_c$ = mean number of procedure Call-Return,
$n_v$ = mean number of variable accesses.
A comparison of the VTC of an average program execution
using the two techniques indicates that


AVTC (P(M2)) < AVTC (P(M4))


provided - 


24 n+ 57 s he 6 ining S” 2 ana WAS get. Sey) Ny 


or ny << 0.35 (considering n,/Ny to be a small positive 


tome beSis T on Ny, 


Quantity) . 

[The preliminary statistics obtained in [Dep82b] from 
the nine numerical programs for digital filtering and speech 
recognition are used for these evaluations. ] 

It should be noted that the observations regarding the 
effectiveness of using either of the techniques (M2 and M,) 
remain unaltered . 

It is extremely difficult to predict the effectiveness 
of either of the techniques in terms of ARTC , unless’ some 
reliable statistics are available regarding n ,n ,n and ny 
ARTC(P(M,)) - ARTC(P(M2)) = 

nig: BS aSiany -11.5) .(9ny - 5) (5 none ) ye 
+ A abe 0. 1,2 + n,[10.2, - 0.8, 0.7, J- 
The average execution cost of a component could be
obtained by multiplying the ARTC matrix by a cost matrix
$(1, W_1, W_2)$, where $W_1$ and $W_2$ are the cost of an average
transfer through the ALU and the cost of a register-memory
transfer respectively. $W_1$ and $W_2$ are the values normalized
with respect to the cost of a transfer between two
registers. The evaluation of the cost includes the
technology dependence.
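
The multiplication by the cost matrix is a weighted sum of the three ARTC entries. In the sketch below the function name and the weights W1 = 2 and W2 = 5 are purely hypothetical values chosen for illustration; only the ARTC vector [22.3, 6.5, 1.7] is taken from the text.

```python
def execution_cost(artc, w1, w2):
    """Average execution cost of a component in units of one
    register-register transfer: R * 1 + A * W1 + M * W2."""
    r, a, m = artc
    return r * 1.0 + a * w1 + m * w2


ARTC_M4_VARAC = (22.3, 6.5, 1.7)   # quoted above for variable access in M4

# Hypothetical technology weights: an ALU transfer costs twice, and a
# memory transfer five times, a register-register transfer.
cost = execution_cost(ARTC_M4_VARAC, w1=2.0, w2=5.0)
```

Changing the weights models a different technology without touching the ARTC analysis, which is why the two are kept separate.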

It has been shown in this chapter that it is possible 
to establish a methodology for choosing the 
hardware/firmware support for the variable addressing
mechanism in the block structured HLL environment. The
methodology could be extended for other similar
architectural components.

The motivation underlying the study was to choose one
of the four well known addressing techniques best-suited for
design of the proposed architecture directed towards the HLL
Ada. In absence of usage statistics for an Ada environment,
the available statistics for Algol-60 were used. It was
surprising to discover that the not so well-known technique
suggested by Rohl yields such an effective mechanism. It
might be noted that Rohl's technique is quite suitable as
long as formal procedure parameters are not taken into
consideration. It could be considered almost as good as
Tanenbaum's technique for the Ada environment as Ada does
not allow formal procedure parameters. In absence of any
reliable statistics for $n_i$ for Ada, it was not possible to
come to a definite conclusion regarding the choice between
Tanenbaum's and Rohl's techniques. But considering some of


the preliminary statistics in [Dep82b], eg. B= 1 and ae =e 2G 
Chapter 4
An Overview of the Proposed Architecture 

The proposed architecture could be broadly 
characterized as a stack processor with tagged memory and 
capability based addressing and protection mechanisms. The 
rationale behind the use of capability based addressing and 
protection mechanisms has already been established. In
Chapter 2 the advantages of using a tagged memory approach 
and especially the advantage of using tagged capabilities 
for efficient domain switching have been explained. This 
chapter includes the rationale for choosing a stack oriented 
instruction set for the processor. 

In presenting an overview of the architecture, the 
memory organization will be briefly indicated without any 
details of policies and mechanisms for virtual memory 
management. The primary emphasis in this chapter will be on 
the processor organization. The capability mechanism will 
be explained in detail with respect to the adopted principle 
of domain switching and implementation of abstract data 
types. But the discussion will not include details of the 
capability mapping mechanism or management of long-term
capabilities in the secondary storage. 

The chapter is organized into six major sections. The
memory organization is introduced in the first section. The
second section deals with the rationale behind the choice of
the processor organization. The problems of protection in a
stack processor organization are discussed along with the
proposed solutions. The organization of the process stack
segment is discussed in detail and the necessary
hardware/firmware support for management of the process
stack is specified in this section. The details of the
capability mechanism are presented in the third section.
The changes in the structure of the process stack segment
due to the inclusion of protection domains and the
capability mechanism are explained in this section along
with the corresponding additional architectural supports.
The fourth section includes the architectural support
provided for dynamic type checking, parameter passing,
accessing and representation of dynamic objects. This
includes explanation of the representations of various
primitive data types in the architecture. The principal
object in the architecture, the packet object, is introduced
in the fifth section. The details of the representation of
this object and the interactions of the various fields of
the packet with the activation records in the process stack
are explained. The final section includes the description
and explanation of the architectural support for
implementing abstract data types. It could be observed from
the above discussion that various architectural features are
introduced in separate sections and the processor
organization is systematically developed through the
chapter.

4.1 Memory Organization
The virtual memory space is essentially visualized as 
three distinct sections: 

a. Stack space, 

b. heap space and 

c. Packet space. 
Another permanently resident segment for some resident
operating system code for supporting memory mapped I/O and
directory space for capability mapping information could be
reserved in the primary memory. (Details of memory
management are not included in this thesis.)

The stack space consists of segments representing the 
process stack segment of a process. Primary constituents of 
a process stack segment are the domain frames that include 
activation records for each domain. Further discussions on 
the domain frames and activation records of a process stack
segment are included in sections 4.2 and 4.3. The process
stack segment is protected by the capability register
GRpidis 

The heap space is meant for objects that are defined
and created at run time of a program. Typical examples of
such objects are dynamic arrays, discriminant records and
accessed variables. The only objects that could be well defined
at compile time but still occupy heap space are the
composite objects (records, arrays, etc.) that are returned
as results of function subprograms in Ada. When an object
is created by a process in the heap space, a reference
pointer (a capability) is returned and stacked on the stack
segment at the appropriate place. Thus every object in the
heap space has a unique system-defined name and is
protected through the capability mechanism.

The packet space in the memory is allocated to packet
objects. The description of the contents of the packet
object is postponed until section 4.5. The packet object
encapsulates a separately compiled module of an Ada program,
i.e., one or more subprograms, a package or a task. Every
packet object is protected by means of a capability.

The primary memory is word addressable. A preliminary
specification of the word length is 64 bits which includes a
4 bit tag. The two hardware registers associated with the
primary memory are the memory address register MAR and the
memory data register MDR.
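
A tagged word of this size can be modelled as follows. This is a sketch only: the text fixes the word length (64 bits) and the tag width (4 bits) but not the position of the tag, so placing it in the four most significant bits is an assumption, as are the function names.

```python
WORD_BITS = 64
TAG_BITS = 4
DATA_BITS = WORD_BITS - TAG_BITS      # 60 bits remain for the data field
DATA_MASK = (1 << DATA_BITS) - 1


def pack_word(tag: int, data: int) -> int:
    """Combine a 4-bit tag and a 60-bit data field into one 64-bit word.

    Assumption: the tag occupies the most significant bits of the word.
    """
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag does not fit in 4 bits")
    if not 0 <= data <= DATA_MASK:
        raise ValueError("data does not fit in 60 bits")
    return (tag << DATA_BITS) | data


def unpack_word(word: int) -> tuple:
    """Split a 64-bit word into its (tag, data) fields."""
    return word >> DATA_BITS, word & DATA_MASK
```

A four-bit tag distinguishes at most sixteen kinds of word, which bounds the number of hardware-recognised types such a memory can mark directly.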


4.2 The Stack Processor 

As noted earlier the proposed architecture could be
broadly categorized as a stack machine. It is necessary to
justify the choice of a stack oriented instruction set for
an architecture directed towards Ada. A study of compiler
systems for all the known programming languages, for the
conventional register oriented architectures, reveals that
no compiler treats the available hardware as a monolithic
resource. Instead, some basic run time model for the
architecture is adopted. The implementation techniques
presented in the literature for any Algol-like language
invariably assume a stack oriented machine as the abstract
target architecture [RaR64, Gri71, Gri74, BaC79, BjO80,
ShS80, Bar81]. The need and desirability for such an
assumption has already been well established. Hence if the
architecture under consideration represents a stack oriented
instruction set, the need for simulating a run time model
and thus generation of extra code for simulation cease to
exist. There does not seem to be any disagreement among the
computer architects regarding the usefulness of
architectural support for efficient implementation and
handling of activation stacks [Mye82, Dor79, Sto80, Ili82].
Although the basic run time model for any Algol-like
language incorporates support for implementing activation
stacks (for implementing scope rules, procedure calling
conventions, etc.), that does not necessarily imply that the
expression evaluations have to be done on the stack, through
the generation of reverse polish code. Although efficiency
consideration for expression evaluation is apparently a
mundane issue, yet it is an extremely important issue. All
assignment statements, IF and CASE statements, most DO
statements and certain other statements in any procedural
language involve expression evaluation. Studies indicate
that 50%+ of written high level language statements and 75%+
of executed statements involve expression evaluation [Kee78,
Tan78, Els76, Mye82]. Thus efficiency of expression
evaluation significantly affects the code space requirement
and execution efficiency of an architecture.
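
Expression evaluation through reverse polish code can be illustrated with a minimal evaluator. The sketch below is not part of the proposed instruction set; the operator table and the function name `eval_rpn` are assumptions made for the example. It also records the deepest point the operand stack reaches, which is the quantity that determines how many stack locations an expression needs.

```python
import operator

# Dyadic operators of the illustrative zero-address code.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_rpn(tokens):
    """Evaluate reverse polish code; return (value, maximum stack depth)."""
    stack = []
    max_depth = 0
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()              # operands come off in reverse order
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(int(tok))           # operand: push onto the stack
        max_depth = max(max_depth, len(stack))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0], max_depth


# (2 + 3) * (7 - 4) in reverse polish form
value, depth = eval_rpn("2 3 + 7 4 - *".split())
```

For this expression the evaluator never holds more than three operands at once, although the expression itself names four.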

Two major studies on this issue of code space
requirement and performance analysis of various instruction
forms (with respect to expression evaluation) are due to
Harvey Cragon [Cra79] and Glen Myers [Mye82]. Cragon's
study is quite realistic and he observes that when a
consistent set of assumptions is applied to the five
instruction forms (2/3 address memory-memory, one address
accumulator, register-register and 0/1 address
stack-in-processor), no significant differences are found in
either code space or performance. The conclusion is not too
surprising from the information theoretic point of view.

Myers approaches the problem in a slightly different
way in the sense that he computes the S and M measures for
the above mentioned instruction forms by considering seven
different expressions and their corresponding frequencies of
occurrence in high-level language programs. The analysis
indicated that the 0/1 address stack-in-processor form is
marginally inferior to the 2/3 address memory-memory form
with respect to the M measure. In the context of the S
measure, his analysis shows that the 2/3 address
memory-memory form is definitely superior to the 0/1 address
stack-in-processor form. The technique of the analysis is
positively biased and is unrealistic in the sense that unless
realistic instruction streams are considered, the 0/1 address
stack form would require extra load-store sequences that are
not really
necessary. A similar argument holds for register oriented 

Based on the results of these studies, the conclusion
is that the choice between the instruction forms
(stack-in-processor, 2/3 address memory-memory) should not
significantly affect the code space requirement or execution
performance of the processor. The other advantages of a
stack processor for an Algol-like environment have been
indicated earlier, and thus it was decided that a stack
instruction set would be the most suitable for an
architecture directed towards Ada.

Myers [Mye82] points out in his report that though the
stack-in-processor form seems to be reasonably attractive
for expression evaluation, it is unrealistic to have the
complete stack in the processor. A study by Tanenbaum
[Tan78] indicated that 99.7% of expressions in a typical
program do not require more than four operands on the stack.
Thus it is not unrealistic to assume that the four top of
stack locations exist in the processor. Management of such a
stack top does not really complicate the situation [Bla77,
Den80, Sto80]. Moreover the use of a stack cache [Dit82]
and of instruction look ahead principles [Dit80] for stack
machines has been adequately established in the literature.
In the present proposal, four top of stack locations are
considered to be available in the processor.
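
The operand count that Tanenbaum's figure refers to can be illustrated by tracking the stack depth while a reverse Polish expression is evaluated. The following is a minimal sketch in Python and is not part of the proposed design; the function name `max_stack_depth` and the restriction to binary operators are assumptions made for illustration.

```python
BINARY_OPERATORS = {"+", "-", "*", "/"}


def max_stack_depth(postfix_tokens):
    """Return the deepest operand stack reached while evaluating
    a reverse Polish expression that uses binary operators only."""
    depth = 0
    deepest = 0
    for token in postfix_tokens:
        if token in BINARY_OPERATORS:
            if depth < 2:
                raise ValueError("operator without two operands")
            depth -= 1  # two operands popped, one result pushed
        else:
            depth += 1  # operand pushed
            deepest = max(deepest, depth)
    if depth != 1:
        raise ValueError("expression does not reduce to one value")
    return deepest


# (a + b) * (c - d) in reverse Polish form
print(max_stack_depth("a b + c d - *".split()))
```

An expression whose depth stays within four would, under this reading, be evaluated entirely in the top of stack locations held in the processor.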


4.2.1 The Salient Features and Problems in Stack Architecture

Now that the suitability of a stack processor is
established, the important features of a stack architecture
will be summarized. Some of the problems associated with
stack architectures will also be indicated along with the
proposed solutions.

Some of the important features of the stack 
architectures are the following: 

a. It allows easy implementation of block structured 
languages like Ada. 

b. The usual register allocation problem for temporary 
variables is absent as the activation record of a 
procedure or a block provides automatic allocation 
of temporaries and local variables on the activation 
stack. 

c. As indicated in (a), the stack architecture provides
a natural support for allocation of procedure
activation records and thus parameter passing
becomes extremely simple. Similarly the protection
domains of a process could be allocated on the
process stack segment as domain frames. Thus
capability passing across domains during domain
switching becomes as straightforward an operation as
that of parameter passing. This point will be
further elaborated in the next section.

d. It provides locality of reference as most of the
operands and parameters are available in the stack
segment. The stack segment of a process is
protected by a capability that is made available in
a predefined register at the beginning of a process
and thus virtual address translation of objects
existing in the stack segment does not require any
additional overhead.

e. It allows stack relative addressing modes in which 
the necessity of explicit specification of the 
segment address (i.e., the capability for the 
segment) does not arise, thus yielding shorter 
instructions. On the other hand it provides
compaction of code space through zero address 
instructions (where the top two elements of the 
stack are the implicit operands). 
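
The effect of zero address instructions can be sketched with a small interpreter. The Python fragment below is only an illustration: the mnemonics PUSH, STORE, ADD, SUB and MUL and the dictionary used as memory are assumptions, not the instruction set of the proposed processor.

```python
def run(program, memory):
    """Execute a list of (mnemonic, operand) pairs on an operand stack."""
    stack = []
    for mnemonic, operand in program:
        if mnemonic == "PUSH":      # one address: operand names a variable
            stack.append(memory[operand])
        elif mnemonic == "STORE":   # one address: pop into a variable
            memory[operand] = stack.pop()
        elif mnemonic in ("ADD", "SUB", "MUL"):
            right = stack.pop()     # zero address: operands are implicit
            left = stack.pop()
            if mnemonic == "ADD":
                stack.append(left + right)
            elif mnemonic == "SUB":
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            raise ValueError("unknown mnemonic: " + mnemonic)
    return memory


# x := (a + b) * c
program = [
    ("PUSH", "a"), ("PUSH", "b"), ("ADD", None),
    ("PUSH", "c"), ("MUL", None), ("STORE", "x"),
]
memory = run(program, {"a": 2, "b": 3, "c": 4})
print(memory["x"])
```

Only the PUSH and STORE instructions carry an operand field; the arithmetic instructions need no address bits at all, which is the source of the code compaction mentioned in (e).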

The concept of the 'own' variable (OV) [Pra75] poses an
allocation problem in stack architectures. Own variables
declared in a procedure cannot be allocated on the
activation stack as the variables should exist between
activations of the procedure. In Ada, as mentioned earlier,
variables declared in a package are to be treated as 'own'
variables. The problem is solved in the proposed
architecture by allocating 'own' variables that are scalar
or statically determined composite structures in the OV
space in the packet object. The own variables that are
dynamically determined composite structures would be stored
in the heap space with references to the heap objects in the
OV space of the packet object representing the Ada package.
The solution is explained in detail in section 4.5.

Similarly, stack architectures do not provide any
simple technique for implementing abstract data type
mechanisms. The problem is not due to any deficiency in the
stack mechanism but due to conflicting requirements of
'information hiding' and scope/visibility rules in
block-structured languages. An elegant solution to the
problem is presented in the proposed architecture through a
combined interaction of the capability mechanism, the stack
mechanism and the packet object. The details of abstract
data type implementation are given in section 4.6.

A couple of protection problems are also associated 
with the commonly known stack architectures. One of the 
problems is linked with the display-relative addressing for 
block-structured languages. The concept of display relative 
addressing was discussed in Chapter 3. The protection 
problem associated with this addressing mechanism is that 
there is no check on the offset value that could be added to 
the contents of the display register for accessing an object 
in a particular lexical level. As there is no check on the
limit of the offset value, any address on other lexical
levels could be generated, although this is illegal from the
point of visibility in block-structured languages. This
problem is handled in the Burroughs series of machines
through certified compilers [Dor79]. An easy and natural
architectural solution to the problem is to provide a limit
field in each display location.
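
The limit check suggested above can be sketched as follows. This is a minimal illustration in Python, assuming that each display location holds a (base, limit) pair and that the limit counts addressable words; the names `DisplayEntry`, `display_address` and `AddressingError` are hypothetical.

```python
from dataclasses import dataclass


class AddressingError(Exception):
    """Raised when an offset falls outside the addressed lexical level."""


@dataclass
class DisplayEntry:
    base: int   # base address of the local area for this lexical level
    limit: int  # number of words that may be addressed from the base


def display_address(display, level, offset):
    """Form base + offset for a lexical level, rejecting out-of-range offsets."""
    entry = display[level]
    if not 0 <= offset < entry.limit:
        raise AddressingError(
            "offset %d outside lexical level %d" % (offset, level))
    return entry.base + offset


display = {1: DisplayEntry(base=100, limit=8),
           2: DisplayEntry(base=140, limit=4)}
print(display_address(display, 2, 3))
```

Without the limit field, an offset of 4 or more at level 2 would silently produce an address belonging to some other part of the stack.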

Some machines allow generating addresses relative to
the 'top-of-stack' (TOS) and the address is checked by the
base-bound pair for the whole stack segment. Obviously such
explicit TOS relative addressing should not be permitted in
any machine instruction (it could be done at the microcode
level). Similarly most of the commercially known stack
architectures do not restrict the POP instruction and zero
address instructions to only the temporaries. Usually the
issue of protection of the administrative information,
parameters and local variables is relegated to the
certified compiler. The proposed architecture takes a
different stand and demarcates the base of the temporary
area through a TB register; tags are provided for words of
the AIR area and the above mentioned instructions are only
allowed to operate on temporary operands.
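
The role of the TB register can be illustrated with a small model. The Python sketch below assumes a stack held as a list in which everything below TB is protected; the class and exception names are hypothetical, and the tags on AIR words are not modelled.

```python
class ProtectionError(Exception):
    """Raised when an instruction reaches below the temporary area."""


class ProcessStack:
    def __init__(self, protected_words):
        # AIR words, parameters and locals sit below the temporaries
        self.words = list(protected_words)
        # TB marks the base of the temporary area of the current frame
        self.tb = len(self.words)

    def push(self, value):
        self.words.append(value)

    def pop(self):
        # POP (and zero address instructions) may only consume temporaries
        if len(self.words) <= self.tb:
            raise ProtectionError("POP below TB")
        return self.words.pop()


stack = ProcessStack(["link", "return address", "local x"])
stack.push(7)
print(stack.pop())
```

A second `pop()` at this point raises `ProtectionError` instead of removing the local variable.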

An important problem in all stack machines is the
problem of ensuring the integrity of the procedure calling
sequence. Again this issue is not usually considered as a
problem as machine language programming is not allowed on
most of these machines [Dor79, Org73] and the compiler
generated code maintains the integrity. As indicated in
Chapter 3, the proposed architecture takes care of the
problem by microcoding the call sequence for procedure
calling without parameters. For procedure calling with
parameters, a different instruction pair is used (PBEGIN and
PCALL). The details of the architectural support for
calling Ada subprograms with parameters are provided in
section 4.4. The scheme requires PBEGIN to precede a PCALL.
The requirement is enforced by a status bit at the
microcode level. The microcode sequence for PCALL requires
the status bit to be set (the status bit could only be set
by a PBEGIN) for a successful execution.
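
The pairing of PBEGIN and PCALL through a status bit can be sketched as below. This Python model is an illustration only; in particular, clearing the bit when PCALL completes is an assumption, since the text states only that PBEGIN is the sole instruction able to set it.

```python
class SequenceError(Exception):
    """Raised when PCALL is issued without a preceding PBEGIN."""


class CallSequencer:
    def __init__(self):
        self.pbegin_seen = False  # the microcode level status bit
        self.called = []

    def pbegin(self):
        self.pbegin_seen = True   # only PBEGIN can set the bit

    def pcall(self, procedure):
        if not self.pbegin_seen:
            raise SequenceError("PCALL without a preceding PBEGIN")
        self.pbegin_seen = False  # assumption: PCALL clears the bit
        self.called.append(procedure)


sequencer = CallSequencer()
sequencer.pbegin()
sequencer.pcall("P")
print(sequencer.called)
```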


4.2.2 A Preliminary View of the Stack Segment

Figure 4.1 presents a preliminary view of the stack
frame and shows the various hardware registers related to
the management of information in the stack segment. The
view is preliminary in the sense that various other features
and architectural supports will be added on to this view as
we proceed through the chapter.

In Chapter 3, it was demonstrated how the
implementation technique chosen for designing the
architectural support for variable addressing affects the
execution efficiency of the architecture. In that chapter
it was shown that the implementation technique based on 
Rohl's proposal was the most suitable one for designing the 
architectural support for variable addressing for an 
architecture directed towards Ada. Hence the view of the 
stack frame developed in this section and the corresponding 
hardware supports shown in Figure 4.1 are directly derived 
from the conclusion in Chapter 3. 

In figure 4.1, the view of the stack segment shows three
activation records. The most recent activation of a block
or procedure is represented by the top most stack frame.
The pointer to the base of the top most stack frame is
contained in the register FM. Similarly the register TB
points to the base of the area in the stack frame allocated
for storing of temporary variables and expression
evaluation. The lexical level index of the last activated
block or procedure body is contained in the Dis-top
register. The corresponding display register
(Display[Dis-top]), in the bank of display registers, points
to the base of the area in the latest activation frame that
is allocated to the local variables of that activation. The
concept of display register oriented addressing has already
been explained in Chapter 3. The area allocated to the
administrative information for return/exit (AIR area)
contains information that is dependent on whether the frame
under consideration represents a block or procedure
activation record. If the frame represents a procedure
activation record then the AIR area contains the link,
return address, previous Dis-top, previous FM and the
previous TB. In case of a block activation record all the
above information except the return address and the previous
Dis-top are present. The precise meanings of these AIR
contents have been explained in Chapter 3 in the context of
Rohl's mechanism (M2). In the next section, once the
capability mechanism in the architecture is introduced, this
preliminary view of the stack segment will undergo
substantial change to include frames for protection domains.


Similarly after introduction of the parameter passing
mechanism, the structure of the procedure activation record
will be modified to reflect the necessary changes.
Additional hardware/firmware supports will be introduced as
and when necessary in the process of stepwise development of
the complete processor.

Fig. 4.1 : A View of the Stack Frame


4.3 The Capability Mechanism

Before getting into the finer details of the
capability mechanism in the proposed architecture, it is
necessary to present an overall picture of the capability
based addressing and protection as adopted for this design.

It was indicated earlier that any separately compilable
unit in Ada (a subprogram, package or task) is represented
at the machine level as a packet object in the packet space
of the memory. A packet object consists of pure (reentrant)
code corresponding to procedures, functions or package
initialization routines and some reserved areas for
descriptor templates, own variables as well as capabilities.
So an Ada process executes code segments of various packet
objects associated with the process. In this architecture a
packet object relates to a specific protection domain.
Hence an Ada process could execute in a number of protection
domains. There also exists a one-one correspondence between
a packet object and a section of the process stack segment
named as a domain frame.

The activation records related to a packet object
(corresponding to procedures and blocks) are contained in
the domain frame associated with the particular packet under
consideration. The stack segment of a process and every
packet object are protected by the capability mechanism. A
capability (in other words, a unique system-defined name)
would be associated with a packet object when it is created
by a process. The process stack segment gets associated
with a capability when the corresponding process is
initiated. Further details on the packet object and the
interactions between a packet object and the process stack
segment are postponed until section 4.5.

In this proposal, capabilities name and control access
to objects in the virtual address space. The capability
associated with an object represents a unique system-wide
name for the object irrespective of where in the system the
object is located and regardless of which process
uses it. The actual representation of an object is not
defined in the architecture (as in SWARD [Mye82]). Some
typical objects that are associated with and identified by a
capability are process stack segments, packet objects, and
simple and composite structures in the heap space. Each
object in the system is assigned the unique name at the time
of creation (at run time).

As the addressing mechanism of the architecture is
built around the concept of capability based addressing,
discussed earlier, the virtual address of an operand is
considered to be of the form (capability, offset). It will
be seen in the discussions to follow that often it will not
be necessary to explicitly specify the capability component
of the virtual address. The form of the virtual operand
address depends on the type of instruction referring to the
operand. For the stack oriented instructions, referring to
operands in the process stack segment, the virtual address
translation mechanism would automatically assume the
capability of the process stack as the implicitly declared
capability in the operand address. Similarly instructions
specifying address references in the instruction space of a
packet object would be implicitly specifying the capability
associated with that packet. Capabilities have to be
explicitly specified only for the operands existing in the
heap space.


4.3.1 The Capability Representation

Before describing the exact form of the capability 
proposed in this architecture, it is necessary to enunciate 
the rationale underlying the design decisions. Consider the 
requirements of the capability representation: 

1. It should represent a unique system wide name for an
object (process stack segments, packets and objects in
the heap space).

2. Access rights for operations permitted on the object
should be specified in the capability representation.

3. The capability representation should be able to uniquely
identify a protected procedure in the packet object.
(It is particularly necessary for implementation of
abstract data types in Ada.)

4. Similarly it should be able to name and restrict access
to a word within a composite object without exposing the
complete object. This requirement arises while passing
an element of an array or a record as a reference parameter
or while passing a word in the stack segment across a
domain frame. A further restriction regarding a
capability identifying a word in the process stack
segment is that the capability should not be allowed to
be copied out of the stack segment.

5. The name/identifier part of the representation should be 
protected against any modification. It should be 
possible to modify the access rights to the extent of 
deleting certain rights but no amplification should be 
permitted. 

The first requirement implies that enough bits should 
be allocated in the capability representation to uniquely 
name objects created throughout the lifetime of the system. 
There are various suggestions in the literature regarding 
the number of bits required for the unique object 
identifier. Some of the suggestions are: 

Linden - 50 bits [Lin76], Myers - 30 bits [Mye82], Dennis -
37 bits [Den80].

Dennis has shown that even with a highly inflated
assumption of 36 objects being created per second, a 37 bit
identifier should be able to provide unique names for 125
years! In the present design 36 bits are reserved for the
capability identifier.
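
The order of magnitude of this estimate can be checked with a few lines of arithmetic. The Python sketch below takes the figures as quoted above and assumes a constant creation rate with identifiers never reused; under those assumptions a 37 bit identifier lasts roughly 120 years. The helper name `years_until_exhaustion` is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_until_exhaustion(identifier_bits, objects_per_second):
    """Years before all identifiers of the given width have been issued."""
    return 2 ** identifier_bits / objects_per_second / SECONDS_PER_YEAR


print(round(years_until_exhaustion(37, 36)))
print(round(years_until_exhaustion(36, 36)))
```

Each bit removed from the identifier halves the span, so the width of the identifier field trades directly against the lifetime over which names stay unique.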


The access authorization field would have five bits
representing the five different authorizations - 'Read',
'Write', 'Enter', 'Copy' and 'Destroy'. A process
possessing a capability for a segment with 'Read'
authorization would only have the authorization to read the
segment pointed to by the capability identifier. Similar is
the 'Write' authorization. The 'Enter' authorization does
not carry its usual meaning as in the C-list oriented
capability architectures [Den66, Eng72, Nee74]. The 'Enter'
authorization could only be present in a capability
that names a packet object. The presence of 'Enter'
authorization in a capability allows the process possessing
the capability to switch its protection domain to execute in
the packet specified by the capability. The 'Copy'
authorization in the capability allows passing a copy of the
capability across domain frames. The 'Destroy'
authorization implies the authority to destroy the segment
named by the capability. The capability representation for
this architecture is shown in figure 4.2.

| tag | form | authority | identifier | offset or index |

Figure 4.2 The capability representation.
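
The five authorization bits and the rule that rights may be deleted but never amplified (requirement 5) can be sketched with a bit mask. The Python fragment below is an illustration; the assignment of bits to authorizations and the function name `restrict` are assumptions.

```python
from enum import IntFlag


class Authority(IntFlag):
    # One bit per authorization; the bit order here is an assumption
    READ = 1
    WRITE = 2
    ENTER = 4
    COPY = 8
    DESTROY = 16


def restrict(current, keep):
    """Delete rights: the result never contains a right absent from current."""
    return current & keep


rights = Authority.READ | Authority.WRITE | Authority.COPY
reduced = restrict(rights, Authority.READ | Authority.ENTER)
print(reduced == Authority.READ)
```

Because the only operation offered is a bitwise AND with the existing field, asking for 'Enter' on a capability that lacks it has no effect.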


Requirements 3 and 4 introduce the necessity of
having a 'form' field in the representation. This field as
well as the 'Authority' field are not encoded, for execution
efficiency. The interpretation of the form field is as
follows:

form 001 : capability for an object segment
           and the offset field is to be
           ignored.

form 010 : capability for a word in any
           segment other than the process stack
           segment and the offset field
           represents the offset of the
           word in the segment.

form 100 : similar to 010 but the named
           word is in the process stack
           segment (cf. requirement 4).


Another form is 'all zero', representing the capability for a
packet; the presence of 'Enter' authority is mandatory to
execute instructions in the packet. For this form the index
field holds the entry point index in the header section of
the packet object (in this case the field is not encoded and
should only have a single 1 at a time). Thus it is possible
to have 16 protected entry points in the packet object.
Further details of the representation of the packet object
are given in section 4.5.

The capabilities of the forms '010', '100' and '000'
could only be created by a process having a capability of
the form '001'. The creation of these three capability
forms from the '001' form is handled by the COMPUTE
CAPABILITY instruction.
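
The derivation rule for the capability forms can be sketched as follows. This Python model is an illustration rather than the actual COMPUTE CAPABILITY microcode: the `Capability` record, the use of a set of strings for the authority field and the check that a packet entry index has exactly one of 16 bits set are assumptions made for the example.

```python
from dataclasses import dataclass, replace

FORM_SEGMENT = 0b001       # whole object segment
FORM_WORD = 0b010          # word outside the process stack segment
FORM_STACK_WORD = 0b100    # word inside the process stack segment
FORM_PACKET_ENTRY = 0b000  # packet entry point


@dataclass(frozen=True)
class Capability:
    form: int
    authority: frozenset
    identifier: int
    offset_or_index: int = 0


def compute_capability(parent, new_form, offset_or_index):
    """Derive a '010', '100' or '000' capability from a '001' capability."""
    if parent.form != FORM_SEGMENT:
        raise PermissionError("only a '001' capability can be refined")
    if new_form not in (FORM_WORD, FORM_STACK_WORD, FORM_PACKET_ENTRY):
        raise ValueError("unsupported target form")
    if new_form == FORM_PACKET_ENTRY:
        # assumption: 16 bit unencoded index with exactly one bit set
        one_hot = (0 < offset_or_index < (1 << 16)
                   and offset_or_index & (offset_or_index - 1) == 0)
        if not one_hot:
            raise ValueError("entry point index must have a single bit set")
    # identifier and authority are carried over unchanged
    return replace(parent, form=new_form, offset_or_index=offset_or_index)


segment = Capability(FORM_SEGMENT, frozenset({"read"}), identifier=42)
word = compute_capability(segment, FORM_WORD, 5)
print(word.form == FORM_WORD, word.offset_or_index)
```

Since the record is immutable and the identifier is copied from the parent, a derived capability can name only a part of the object that the '001' capability already names.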


4.3.2 Capability Mapping Mechanism 

The proposed capability mapping mechanism and register
support structure (in the next subsection) for capability
based addressing bear resemblances to the corresponding
architectural features in the Plessey 250 system [Eng72] and
a capability architecture proposed by Dennis [Den80]. Thus
details of the mapping mechanism or long term management of
capabilities in the secondary storage will not be presented.
Only so much detail will be provided as is necessary to
explain the remaining new features of the architecture.

To implement capability addressing, it is necessary to
map the virtual addresses (the capability identifier, offset
pair) into primary memory addresses. The mapping mechanism
could be very similar to those proposed for mapping segment
descriptors in segmented memory management systems
[MaD74, Wil72]. A table is needed in the secondary storage
to relate the capability identifiers to the objects stored
in the backup secondary storage. As the table would be
extremely large it could be represented as a tree of tables.
A table is also required in the nonrelocatable part of the
primary memory to contain information relating capability
identifiers to primary memory addresses. This table will be
referred to as the PMT (primary memory table). The primary
memory addresses will be present in the (base, limit) form.
An entry in the PMT would contain the following fields:
a. the capability identifier,
b. the 'base' of the object in primary memory,
c. the 'limit' or the length of the object,
d. some bits necessary for memory management and
replacement policies and,
e. a pointer to the corresponding entry in the mapping
table in secondary storage (to be used when a segment
fault occurs for the object linked to this entry).


Thus the PMT is used to map capability identifiers into
primary memory addresses and the PMT would keep a record of
the segments (objects) currently being used by a particular
process executing in the system.
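
The mapping performed through the PMT can be sketched as a table lookup followed by a bounds check. The Python fragment below is a simplified illustration: the PMT is modelled as a dictionary keyed by capability identifier, only the first three fields of an entry are kept, and the exception names are hypothetical.

```python
from dataclasses import dataclass


class SegmentFault(Exception):
    """The object is not resident; the secondary storage table is needed."""


class BoundsViolation(Exception):
    """The offset lies outside the object."""


@dataclass
class PMTEntry:
    identifier: int  # capability identifier
    base: int        # base of the object in primary memory
    limit: int       # length of the object


def translate(pmt, identifier, offset):
    """Map a (capability identifier, offset) pair to a primary memory address."""
    entry = pmt.get(identifier)
    if entry is None:
        raise SegmentFault(identifier)
    if not 0 <= offset < entry.limit:
        raise BoundsViolation(offset)
    return entry.base + offset


pmt = {7: PMTEntry(identifier=7, base=4096, limit=256)}
print(translate(pmt, 7, 10))
```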


4.3.3 The Set of Capability Registers

The architecture includes ten capability registers
CR[1] - CR[10] to support the capability mechanism. These
registers would contain the mapped form of the capabilities
used in a specific domain. The set of registers is always
associated with a particular domain of protection. The use
of registers for mapped capability information is motivated
by the desire for short addresses and fast referencing of
capabilities that are often used. It was indicated earlier
that when a capability is used for addressing, the mapping
mechanism is invoked for retrieving the primary memory
address (base-limit pair) from the PMT entry corresponding
to the capability identifier under consideration. It is
obvious that it would not be wise to use the mapping
mechanism for generating each and every address. The
capabilities that are to be used frequently are loaded into
capability registers (in the mapped form). In this
architecture every address specifies a capability either
implicitly (for the capability for the process stack or a
packet object) or explicitly. The descriptions of the
operations involving the process stack segment with domain
frames and association of the set of capability registers
with a domain frame are presented in the next section.

The format of the contents of a capability register is
shown in Figure 4.3.

Figure 4.3 Capability register format.

The instruction LOAD CAP REG loads a specified
capability into the register indicated in the instruction.
The LOAD CAP REG instruction invokes the capability mapping
mechanism to retrieve the capability from the PMT and loads
the base-limit fields of the PMT entry into the specified
register. It also loads the 'Authority' field of the
register from the capability and sets the load indicator (L)
on. There could be some more bits in the capability
register for management and replacement policies.

Of the ten capability registers provided in this
design, CR[1] to CR[8] could be loaded and used by compiler
generated code for user programs. The registers CR[9] and
CR[10] are reserved for specific system use. The register
CR[9] gets loaded with the capability for the segment,
representing a packet, from which the instructions in a
particular domain would be fetched and executed. The
content of register CR[9] would get changed when domain
switching takes place via an ENTER instruction. The program
counter is always interpreted relative to CR[9]. The
capability register CR[10] is always loaded with the
capability representing the process stack segment. It is
changed only when the processor switches from one process to
the other. This thesis essentially deals with a single
process and thus any changes in the content of CR[10] would
never be encountered.

It should be observed that only the machine instruction
LOAD CAP REG could write into the registers CR[1] to CR[8],
and that also by an explicit specification of a capability
stored either in the packet or the process stack segment.
There is no way in which any user-generated instruction
could modify a capability, other than deleting some of the
access rights (authority).
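
The behaviour of LOAD CAP REG described above can be sketched as follows. The Python model is an illustration under simplifying assumptions: a capability is reduced to an (identifier, authority) pair, the PMT to a dictionary of (base, limit) pairs, and the names `CapabilityRegister` and `load_cap_reg` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CapabilityRegister:
    base: int = 0
    limit: int = 0
    authority: frozenset = frozenset()
    loaded: bool = False  # the load indicator (L)


def load_cap_reg(registers, index, capability, pmt):
    """Load the mapped form of a capability into CR[index]."""
    if not 1 <= index <= 8:
        raise PermissionError("only CR[1] to CR[8] are user loadable")
    identifier, authority = capability
    base, limit = pmt[identifier]  # the mapping mechanism
    registers[index] = CapabilityRegister(base, limit, authority, True)


registers = {i: CapabilityRegister() for i in range(1, 11)}
pmt = {7: (4096, 256)}
load_cap_reg(registers, 3, (7, frozenset({"read"})), pmt)
print(registers[3].base, registers[3].loaded)
```

An attempt to load CR[9] or CR[10] through this path is rejected, mirroring the reservation of those two registers for system use.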

In this architecture, the execution of a process in
various protection domains is synonymous with the execution
of a process in various packets. This concept introduces an
interesting notion of user-defined domains of protection.
The term 'user-defined' is used as the user could define the
contents of a packet object. The number of subprograms 
constituting the instructions in a packet object could be 
controlled by the user through separate compilation. This 
feature could be naturally visualized in the context of Ada, 
as Ada presents well defined means of separate compilation 
[GoH80]. 

When any procedure or block in a packet is executed, an 
activation record is created on the stack for the particular 
invocation. To define an encapsulation of the activation 
records of a specific packet object, the concept of domain 
frames in the process stack segment is introduced. The 
introduction of domain frames in the process stack segment 
changes the preliminary view of the stack frame presented in 
section 4.2. 

It was mentioned earlier that the contents of the set
of capability registers CR[1]-CR[9], at any instant of time,
are uniquely related to a specific protection domain, i.e., a
domain frame on the process stack. To implement this
relationship, a specified section in the beginning of the
domain frame is reserved for the nine capabilities that
could get loaded into the set of capability registers. When
a capability register is loaded, the corresponding
capability is stored in the reserved location corresponding
to the particular capability register under consideration.
This reserved area in the domain frame would be referred to
as the capability register backup area (or CRB). Location
CRB[i] in the CRB area would represent the capability whose
mapped version exists in CR[i] (for i = 1...9). It should be
noted that information in CR[i] is valid only when the load
indicator in the register is ON. If CR[i] contains valid
information, the semantics of the LOAD CAP REG instruction
imply that CRB[i] contains the original capability that
has been processed by the mapping mechanism and the mapped
version (cf. figure 4.3) exists in CR[i].

At the time of domain switching, the contents of the
capability registers need not be stored in the new domain
frame. It is sufficient to store the information regarding
the status of the L indicators for each of these
registers. While returning back to the previous domain, the
status of the L indicators is checked and the corresponding
capability registers are loaded from the information stored
in the CRB area.
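
The saving of L indicators and the later reload from the CRB area can be sketched as below. The Python fragment is an illustration with three registers instead of nine; capabilities are reduced to bare identifiers, the mapping mechanism to a dictionary lookup, and both function names are hypothetical.

```python
def l_indicators(registers):
    """Snapshot which capability registers hold valid mapped information."""
    return [reg is not None for reg in registers]


def reload_registers(indicators, crb, map_capability):
    """Rebuild the registers of the previous domain from its CRB area."""
    return [map_capability(crb[i]) if on else None
            for i, on in enumerate(indicators)]


pmt = {7: (4096, 256), 9: (8192, 64)}
crb = [7, None, 9]                  # capabilities kept in the domain frame
registers = [pmt[7], None, pmt[9]]  # mapped (base, limit) forms
saved = l_indicators(registers)     # stored at the domain switch
restored = reload_registers(saved, crb, pmt.__getitem__)
print(restored == registers)
```

Only one bit per register crosses the domain boundary; the capabilities themselves remain in the CRB area of the frame they belong to.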




4.3.4 Introduction of Domain Frames in the Process Stack 

A modified view of the process stack segment is shown
in figure 4.4. The modifications to the earlier version
presented in figure 4.1 are due to the introduction of the
notion of protection domains in the architecture. The
necessary hardware added on to the previous view for
supporting this feature is indicated in figure 4.4 and
will be explained in this section.

The stack segment shown in figure 4.4 consists of three
domain frames. It indicates that the process under
consideration has already executed in two packets and has
entered the third packet pointed to by CR[9]. A domain
frame would consist of as many activation records as the
number of procedure or block activations during the
execution of the process in the corresponding packet. The
third domain frame contains three activation records. The
CRB register points to the base of the CRB area of the
top-most domain frame.

The modification in the information content of the
AIR region of the first activation record in a domain frame
is quite significant. The additional items of information in
the AIR of the first activation record in a domain frame, as
compared to other AIRs (given in section 4.1.2), are the
following:

a. contents of the CRB register of the last domain,
b. the complete display register bank and
c. L indicators of the nine capability registers from
the previous domain.

Figure 4.4 : A Modified View of the Process Stack

The emptying of the display register bank during the
domain switch prevents any access of activation records in
the calling domain from the newly entered domain. Thus
lexical level addressing implemented by display registers
deals with the top domain frame only. Similarly tags on
every word in the AIR (cf. section 4.5) allow only the
RETURN group of instructions (RETURN, PRETURN, FRETURN) to
use the contents of the AIR area, and prevent any other
instruction from accessing information in the previous
domains.

A new domain is entered by executing an ENTER
instruction. The ENTER instruction requires a capability
(with 'Enter' authority) for the new packet object to be
entered. The ENTER instruction expects the capability to be
on the top of the stack. So the calling domain prepares a
capability by a COMPUTE CAPABILITY instruction before
issuing an ENTER instruction. The COMPUTE CAPABILITY
instruction requires a capability (with 'Enter' authority)
in the packet object containing the instruction and a 16 bit
value representing the entry point on top of the stack. On
successful execution it leaves a capability with the entry
point index specified in the index field of the capability.
In addition to storing the required information in the AIR
of the first activation record of the new domain frame and
resetting the capability registers CR[1] to CR[9], the ENTER
instruction does the following:


1. reserves space for CRB area, 

2. loads CRB[9] with the capability specified in the ENTER 
instruction, 

3. invokes the mapping mechanism and loads CR[9], 

4. resets the display register bank and sets Dis-top to 1, 
and 

5. loads the program counter with the address of procedure 
obtained from entry point information (corresponding to 
the index in the capability) available in the header of 
the new packet. 

The return from a domain back to the calling domain 
takes place through the execution of one of the usual return 
instructions (RETURN or PRETURN). The difference between a 
usual return from a procedure and a domain switch is 
detected at the microcode level from an indicator in the AIR 
area associated with the activation record of the procedure 
executing the return. A domain return sequence performs the 
following in addition to the usual return sequence: 

1. resets the capability registers CR[1] to CR[9], 

2. executes a sequence of load capability register 
operations according to the state of L indicators in the 
AIR of the returning domain frame, (note: only 
microcoded sequences are allowed to access information 
in two different domain frames) and 

3. reloads the display register bank and the Dis-top 
register with values stored in the AIR of the returning 
domain frame. 
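
The save and restore behaviour of ENTER and of a domain return can be sketched as below. This is only an illustrative model, not the microcode of the proposed processor: the names `Processor`, `DomainAIR`, `enter` and `domain_return` are hypothetical, the display bank is modelled as a plain list, and the mapping mechanism, the tags and the program counter are left out.

```python
from dataclasses import dataclass, field

NUM_CR = 9  # CR[1]..CR[9] are the registers reset on a domain switch


@dataclass
class DomainAIR:
    """Extra AIR contents of the first activation record of a domain frame."""
    saved_crb: int
    saved_display: list
    saved_dis_top: int
    saved_l_indicators: list


@dataclass
class Processor:
    """Hypothetical register model; only the state named in the text is kept."""
    crb: int = 0
    display: list = field(default_factory=list)
    dis_top: int = 0
    l_indicators: list = field(default_factory=lambda: [False] * NUM_CR)
    domain_airs: list = field(default_factory=list)

    def enter(self, new_crb):
        # Save the caller's CRB, display bank and L indicators in the AIR.
        self.domain_airs.append(
            DomainAIR(self.crb, list(self.display), self.dis_top,
                      list(self.l_indicators)))
        # Hide the calling domain: empty display bank, Dis-top set to 1,
        # capability registers CR[1]..CR[9] reset.
        self.crb = new_crb
        self.display = []
        self.dis_top = 1
        self.l_indicators = [False] * NUM_CR

    def domain_return(self):
        # Reload the state stored in the AIR of the returning domain frame.
        air = self.domain_airs.pop()
        self.crb = air.saved_crb
        self.display = air.saved_display
        self.dis_top = air.saved_dis_top
        self.l_indicators = air.saved_l_indicators
```

Because the display bank is emptied on entry, lexical level addressing in this model can reach only the top domain frame until the matching return.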

4.4 Some Special Issues in an Architecture for Ada 

Architectural support for efficient implementation of 
abstract data types and the details of the representation of 
the packet objects will be dealt with in the next two 
sections. In the following subsections, other architectural 
features will be presented that are specifically designed 
for efficient runtime representation of Ada programs. 


4.4.1 The Primitive Data Types 

The usefulness of a tagged memory representation was 
indicated earlier. This architecture uses a short tag of 4 
bits with every data word in the memory. The 
interpretations of the tag fields are as follows: 


Tag     Type of information 

0000    Undefined word 
0001    A word representing real number 
0010    An integer word with pointer to a rangeword 
0011    An integer word with lower bound of 0 
0100    An integer word with lower bound of 1 
0101    An AIR word 
0110    An instruction word 
0111    A capability word 
1000    A rangeword 
1001    A word in packet header 
1010    A word in array descriptor 
1011    A boolean word 

The rationale behind the above mentioned primitive types 
will be discussed in the following sections. 
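
For readers who prefer code, the tag assignments can be written as an enumeration. This is a sketch under two assumptions not stated in this section: a 64 bit word (taken from the bit numbering 0..63 used for the integer words) and bit 0 being the most significant bit, so that the tag occupies the top four bits. The names `Tag`, `tag_of` and `is_integer` are hypothetical.

```python
from enum import IntEnum

WORD_BITS = 64  # assumption: bits are numbered 0..63, bit 0 most significant
TAG_BITS = 4


class Tag(IntEnum):
    UNDEFINED = 0b0000
    REAL = 0b0001
    INT_WITH_RANGEWORD = 0b0010   # integer word with pointer to a rangeword
    INT_LOWER_0 = 0b0011          # integer word with lower bound of 0
    INT_LOWER_1 = 0b0100          # integer word with lower bound of 1
    AIR = 0b0101
    INSTRUCTION = 0b0110
    CAPABILITY = 0b0111
    RANGEWORD = 0b1000
    PACKET_HEADER = 0b1001
    ARRAY_DESCRIPTOR = 0b1010
    BOOLEAN = 0b1011


def tag_of(word):
    """Return the tag held in the top four bits of a 64 bit word."""
    return Tag(word >> (WORD_BITS - TAG_BITS))


def is_integer(word):
    """True for the three integer representations."""
    return tag_of(word) in (Tag.INT_WITH_RANGEWORD,
                            Tag.INT_LOWER_0,
                            Tag.INT_LOWER_1)
```

Tag values 1100 to 1111 are not assigned in the table, so `tag_of` raises `ValueError` for them.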


4.4.2 Support for Subranges and Constraint Checking 

Studies done on the occurrence of variables in programs put 
array access at less than 15% whereas simple variable access 
accounts for [..]% of the total [Tan78, Rob76]. In the 
pre-Pascal languages, accessing a simple variable was a 
straightforward operation. However, the introduction of the 
concept of subranges demands additional architectural 
support for constraint checking at run time. Most of the 
well known commercial architectures do not have any special 
support for efficient constraint checking and usually 
generate a significant amount of additional machine code for 
variable accesses to satisfy such requirements in the 
programming languages. This is definitely not acceptable if 
some convenient architectural features could easily improve 
the execution of programs in a language like Ada or Pascal 
[Bis80, Hil..]. So it was decided to provide efficient 
architectural support for constraint checking for integer 
variables and reasonably good (better than, say, VAX 11/780 
or IBM 370) support for real representations. Checking of 
a subrange is synonymous with checking array indices using 
descriptors. 

The above observations lead to the definition of 
primitive data types with tags 0010, 0011 and 0100 for 
representing integers along with subrange information. 
Empirical studies have shown that zero and one tie for the 
first place as lower bounds and together they cover the 
majority of subrange definitions for integer types [Bis80]. 
So two primitive data types are defined for integer 
representations that implicitly encode the lower bound 
information in the representation. 

The formats of the integer words with the tags 0010, 
0011 and 0100 are shown in figure 4.5. 


    bits 0-3          bits 4-33        bits 34-63 
    tag               upper bound      value 
    (0100 or 0011) 
                      form (a) 

    [form (b): bits 0-3 tag; the remaining field boundaries 
    up to bit 63 are illegible] 


Figure 4.5 Integer Words 


The tag 0100 implies that the lower bound is 1 and the 
upper bound is a positive value and is contained in the 
'upper bound' field (bits 4-33). Similarly tag 0011 implies 
that the lower bound is zero and the positive upper bound is 
contained in the upper bound field. These two formats are 
used for short integers with positive upper bounds. The 
form (b) is used for longer integers and for any integers 
that cannot be represented by form (a) (i.e., when lower 
bounds are not zero or one and also when upper bounds 
are not positive). The offset field in form (b) 
represents an offset in the descriptor template area in a 
packet, where a rangeword with tag 1000 is stored. The 
format of a rangeword is shown in figure 4.6. 
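
A minimal sketch of packing and unpacking a form (a) integer word follows. It assumes a 64 bit word with bit 0 as the most significant bit, and an unsigned 30 bit value field in bits 34-63; the source gives only the field positions, so the value encoding and the function names `pack_form_a` and `unpack_form_a` are assumptions made for illustration.

```python
TAG_LOWER_0 = 0b0011   # lower bound is 0
TAG_LOWER_1 = 0b0100   # lower bound is 1
FIELD_BITS = 30        # bits 4-33 and bits 34-63 are each 30 bits wide
FIELD_MASK = (1 << FIELD_BITS) - 1


def pack_form_a(tag, upper, value):
    """Build a form (a) word: tag | upper bound | value."""
    if tag not in (TAG_LOWER_0, TAG_LOWER_1):
        raise ValueError("form (a) needs tag 0011 or 0100")
    if not 0 < upper <= FIELD_MASK:
        raise ValueError("upper bound must be positive and fit in 30 bits")
    if not 0 <= value <= FIELD_MASK:
        raise ValueError("value must fit in 30 bits")
    return (tag << 60) | (upper << FIELD_BITS) | value


def unpack_form_a(word):
    """Return (lower bound, upper bound, value) of a form (a) word."""
    tag = word >> 60
    if tag == TAG_LOWER_0:
        lower = 0
    elif tag == TAG_LOWER_1:
        lower = 1
    else:
        raise ValueError("not a form (a) integer word")
    upper = (word >> FIELD_BITS) & FIELD_MASK
    value = word & FIELD_MASK
    return lower, upper, value
```

The lower bound never occupies any bits of the word: it is recovered from the tag alone, which is the point of defining two separate integer tags.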


Figure 4.6 A rangeword. 


The upper and lower bound fields would contain values 
with sign information. A single rangeword could be used to 
represent the subrange information for many variables if 
they all have the same subrange constraint. It could be 
argued that the representation in form (b) introduces 
unnecessary waste of memory space. The use of the 
system-wide fixed word length concept always introduces the 
problem of lack of proper utilization of the available 
memory space, but the use of fixed word length improves 
execution efficiency. If the usefulness of the fixed word 
length concept is assumed, then form (b) representation 
actually improves the space utilization in the sense that 
the space used for the range information would not have been 
used by a short integer anyway. 

The advantage of associating the subrange information 
directly with the variable is that it will not be necessary 
to fetch and execute separate machine instructions for 
loading range values and performing range check operations 
for every variable associated with subrange information. 

Some extra hardware facilities are also proposed in 
this design to facilitate the often used operation of range 
checking. Two extra hardware registers UR and LR are 
provided in the ALU that would usually contain the upper and 
lower bounds of the variable representing the destination of 
the ALU result. The ALU would also include two separate 
comparators, for the purpose of comparing the ALU output to 
the values stored in UR and LR registers. A special flag RC 
in the ALU is used by the microcode for enabling and 
disabling automatic range checking. 

There are essentially two instructions in the proposed 
architecture that deal with the special registers and the 
automatic range checking hardware. The LOAD RANGE (LR) 
instruction is used for real operands and the range is 
explicitly specified in the instruction. The LOAD RANGE 
instruction loads the explicitly specified upper and lower 
bound values into UR and LR registers respectively and it 
also sets the RC flag. When the RC flag is set, the gating 
of the ALU output to the destination is delayed until the 
range check is automatically performed. The LOAD RANGE 
instruction is generated by the compiler only when 
constraint checking on the ALU output is necessary. 

The other instruction that invokes automatic range 
checking is the CSTORE (check before storing) instruction. 
The address of the destination location is specified in the 
CSTORE instruction. The microcode sequences to be used by 
the CSTORE instruction depend on the tag of the destination 
location. In cases of 0100/0011 tags, the UR register is 
loaded from the upper bound field of the variable and 0 or 1 
is loaded into the LR register. In case of 0010 tag the 
registers UR and LR are loaded from the location in the 
packet specified by the offset field of the destination 
location. Automatic range check is performed before the 
storing operation. 
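
The tag-dependent behaviour of CSTORE can be sketched as follows. This is an illustrative model only: memory words are represented as Python dictionaries, the rangewords of a packet as a mapping from offset to a (lower, upper) pair, and the names `load_bounds`, `cstore` and `ConstraintError` are hypothetical.

```python
class ConstraintError(Exception):
    """Raised when the value to be stored violates the range constraint."""


def load_bounds(dest, rangewords):
    """Return (LR, UR) for a destination word, selected by its tag."""
    tag = dest["tag"]
    if tag == 0b0011:                 # lower bound 0, upper bound in the word
        return 0, dest["upper"]
    if tag == 0b0100:                 # lower bound 1, upper bound in the word
        return 1, dest["upper"]
    if tag == 0b0010:                 # bounds held in a rangeword in the packet
        return rangewords[dest["offset"]]
    raise ValueError("destination is not an integer word")


def cstore(dest, result, rangewords):
    """Check the result against LR and UR, then store it."""
    lr, ur = load_bounds(dest, rangewords)
    if not lr <= result <= ur:
        raise ConstraintError(f"{result} not in {lr}..{ur}")
    dest["value"] = result
```

For example, storing 7 into a word with tag 0100 and upper bound 10 succeeds, while storing 0 into the same word raises `ConstraintError`, because the implied lower bound is 1.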


4.4.3 Support for Parameter Passing in Ada 

Implementation of the parameter passing convention in 
Ada has to deal with some new problems that were not present 
in earlier block-structured languages like Algol, Pascal or 
PL/1. The semantics for scalar parameters imply that 

1. Any range constraint on the formal parameter, for an IN 
or INOUT parameter, must be satisfied by the actual 
parameter at the beginning of the call. 

2. Any range constraint on the variable which is the actual 
parameter, for an INOUT or OUT parameter, must be 
satisfied by the value of the formal parameter upon 
return from the subprogram. 

The above two conditions require constraint checking (with 
respect to the corresponding formal parameters) after 
evaluation of the actual IN or INOUT parameters, and 
constraint checking (with respect to the actual parameter) 
before the assignment of the formal OUT or INOUT parameters 
to the corresponding actual parameters. The problem lies in 
the fact that at the point of return the actual value of the 
local variable (representing the formal) corresponding to 
OUT and INOUT parameters must lie within the range 
constraints of the actual parameter. Such a check cannot be 
performed within the procedure body as the range constraints 
of actual parameters cannot be known to the procedure. 
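
The two checks for a single INOUT scalar parameter can be sketched as below. The function and parameter names are hypothetical and the procedure body is modelled as an ordinary Python function; the sketch only illustrates where each check has to sit relative to the call.

```python
class ConstraintError(Exception):
    """Raised when a value violates a range constraint."""


def check(value, bounds, what):
    lower, upper = bounds
    if not lower <= value <= upper:
        raise ConstraintError(f"{what}: {value} not in {lower}..{upper}")


def call_with_inout(body, actual_value, actual_range, formal_range):
    """Call `body` with one INOUT parameter and return the new actual value."""
    # Condition 1: before the call, the actual parameter must satisfy
    # the constraint of the formal parameter.
    check(actual_value, formal_range, "actual against formal")
    formal_value = body(actual_value)
    # Condition 2: after the return, the value of the formal must satisfy
    # the constraint of the actual. Only the caller knows `actual_range`,
    # so this check cannot be placed inside `body`.
    check(formal_value, actual_range, "formal against actual")
    return formal_value
```

With an actual of range 0..10 and a formal of range 1..100, a body that returns its argument plus one turns 5 into 6, whereas a body that returns ten times its argument fails the second check on return.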

None of the known architectures provides an elegant 
solution for this problem. The proposed architecture 
provides an unconventional but efficient support to deal 
with this problem. Two new pairs of instructions are 
provided: PBEGIN-PEND and PCALL-PRETURN. 

The PBEGIN-PEND pair works in the same way as the ENTRY 
and EXIT instructions used for block entry and exit 
(discussed in Chapter 3) except that PBEGIN sets a flag 
(PB) at the microcode level. 

The PCALL instruction is used for calling procedures 
with parameters. A virtual block is put around the 
procedure body and all parameter type checking operations 
are done in this virtual block. The entry to the virtual 
block is caused by the execution of the PBEGIN instruction. 
The parameters for the ensuing PCALL instruction are 
prepared in this virtual block and left on the process 
stack. There are some interesting problems related to the 
management of the display registers at the time of entry to 
the procedure body. These can best be explained through an 
example. Assume the following conditions: 

(a) the procedure is called from lexical level 4; and 

(b) the procedure identifier is declared at lexical level 2 
and thus the new value in display register 3 would be 
pointing to the section of the stack segment that would be 
allocated to the procedure body after execution of the call 
instruction. 

At the point of calling, the variable accessing 
environment is represented by the Display registers 1 to 4. 
The type checking for IN and INOUT parameters has to be 
performed in this environment as the actual parameters are 
accessible only in this environment. On execution of the 
call instruction, the variable accessing environment for the 
procedure body is represented by display registers 1, 2 and 
3 (with the new value). Thus the actual parameters evaluated 
in the old environment would not be accessible in this new 
environment. A similar situation occurs after execution of 
the return instruction. The type checking of the formal OUT 
and INOUT parameters against the actual parameters cannot be 
performed in the environment represented by the display 
registers 1, 2 and 3, as the actual OUT and INOUT parameters 
may not be accessible in this new environment. 

The introduction of a virtual block enclosing the 
procedure body and a different way of handling the Dis-top 
pointer by these new pairs of instructions solve the 
problem. On execution of PBEGIN a new block at level 5 is 
created. The constraints on the formal parameters in the 
procedure are known to the calling level at compile time. 
The code for constraint checking is included in the virtual 
block invoked by PBEGIN. It should be noted that the 
variable accessing environment as available to the virtual 
block is the same as that of the calling level. So any 
variable declared in this environment could serve as an 
actual IN, INOUT or OUT parameter. The constraint check 
before executing PCALL is done for IN and INOUT actual 
parameters. After the check the actual IN or INOUT 
parameters are available on the stack at lexical level 5 of 
the calling environment. On execution of a call 
instruction, the variable accessing environment would change 
(as indicated earlier). The proposed mechanism requires 
that lexical level 5 of the calling environment be merged 
with the lexical level 3 of the new environment. This is 
achieved through the new instruction PCALL. Similarly, at 
the time of return, PRETURN would do the appropriate 
adjustments so that the procedure returns to the virtual 
block at lexical level 5 of the calling environment, but 
would still be able to access the formal parameters. The 
constraint checking for the OUT and INOUT formal parameters 
and assignment to the actual parameters are done by the code 
in the virtual block. Finally PEND is executed and the 
original calling environment is reestablished. S*A 
descriptions of the PCALL and PRETURN components for Rohl's 
mechanism (M2) are given below. The declarative part of the 
mechanism M2 is assumed. 

Proc M2.PCALL 

    Stack [Sptr] := FM ; FM := Sptr ; 
    Sptr := Sptr + 1 ; 
    Stack [Sptr] := Dis-top ; Sptr := Sptr + 1 ; 
    Stack [Sptr] := Pctr ; Sptr := Sptr + 1 ; 
    Dis-top := Inst-reg.Adr.lex ; 
    Pctr := Stack [Display [Dis-top] + Inst-reg.offset] ; 
    Sptr := Sptr + 1 ; Dis-top := Dis-top + 1 ; 
    Stack [Sptr] := Display [Dis-top] ; 
    Sptr := Sptr + 1 ; 
    Display [Dis-top] := Display [Stack [Sptr + 3]] ; 

    /* Assuming the AIR area to be same as described for 
       Rohl's mechanism in chapter 3 */ 

end Proc 


Proc M2.PRETURN 

    Sptr := FM ; FM := Stack [Sptr] ; 
    Display [Dis-top] := Stack [Sptr + 3] ; 
    /* the LINK is loaded from AIR */ 
    Dis-top := Stack [Sptr + 1] ; 
    Pctr := Stack [Sptr + 2] ; 

end Proc 

The parameter passing mechanism introduced in this 
section requires further modifications of the view of the 
stack segment shown in figure 4.4. A view of the procedure 
activation record with parameters, after the execution of 
the PBEGIN-PCALL sequence, is shown in figure 4.7. After 
execution of a usual CALL instruction, the Frame-mark (FM) 
and Display [Dis-top] would have pointed to the positions 
shown by (**) and (*) respectively, whereas, on execution 
of a PCALL, a portion of the frame allocated to the virtual 
block gets included in the procedure activation record. It 
should be observed that the above mentioned modifications do 
not cause any new protection problem as every word in the 
AIR area is tagged. The only modification necessary would 
be in the virtual addresses generated by the compiler. The 
offset parts of the (lexical level, offset) form of 
addresses for the local variables have to be modified to 
take care of the additional predefined offset introduced by 
the procedure AIR area. 

Fig. 4.7: A Procedure Activation Record With Parameters 


4.4.4 Support for Dynamic arrays and Discriminant records 

Arrays and vectors are represented in this architecture 
in the packed form [Tan76, BaC79] and are addressed using 
dope vectors. The records are flattened out and each field 
is treated as a separate variable [Jew72]. Handling of 
dynamic arrays and discriminant records deserves special 
consideration. The dynamic nature of discriminant records 
is due to the presence of a dynamic array as a field of the 
record. Hence discussion in this section will be restricted 
to representation and addressing of static and dynamic 
arrays. The rank of a dynamic array as well as the type of 
the elements are always known at compile time. The bound 
information for one or more dimensions could be specified at 
execution time. 

The statically defined arrays and array fields of a 
record are allocated on the activation stack. The elements 
of a dynamic array or array component of a discriminant 
record are allocated in the heap space with a capability for 
the representation in the dope vector in the corresponding 
activation record in the process stack segment. 

The packed representation (in column major form) of a 
static array A along with the form of the dope vector is 
shown in figure 4.8. The array shown in the figure 
corresponds to the array declaration: 

A: array (INTEGER range b_1..u_1, ..., INTEGER range b_n..u_n) 
of INTEGER; 

where b_i and u_i (i = 1..n) represent the lower and upper 
bounds of the dimension i respectively. The length l_i shown 
in the dope vector is 

l_i = u_i - b_i + 1. 

The address of an array element is obtained by the 
instructions INDEX and SINDEX. Both the instructions 
require the rank of the array, the subscripts of the element 
and the address of the pointer word in the dope vector to be 
available in a predefined sequence on the activation stack. 
The INDEX instruction does range checking for each and every 
subscript before generating the address, whereas the SINDEX 
(Safe Index) instruction is generated by the compiler when 
the subscripts are known to be correct. Proper use of such 
an instruction could lead to significant reduction 
of execution time for array element addressing [BiB81]. 

Figure 4.8 : Array Representation 
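
The difference between INDEX and SINDEX can be illustrated with a small address computation for a packed column-major array. The sketch assumes one word per element and passes the bounds of each dimension directly as (lower, upper) pairs rather than reading them from a dope vector on the stack; the function names are hypothetical.

```python
def sindex(base, bounds, subscripts):
    """Column-major element address, with no range checks (as for SINDEX)."""
    address = base
    stride = 1
    for (lower, upper), subscript in zip(bounds, subscripts):
        address += (subscript - lower) * stride
        stride *= upper - lower + 1      # length of this dimension
    return address


def index_checked(base, bounds, subscripts):
    """Same address, but every subscript is range checked first (as for INDEX)."""
    if len(subscripts) != len(bounds):
        raise ValueError("number of subscripts must equal the rank")
    for (lower, upper), subscript in zip(bounds, subscripts):
        if not lower <= subscript <= upper:
            raise IndexError(f"subscript {subscript} not in {lower}..{upper}")
    return sindex(base, bounds, subscripts)
```

For an array declared with bounds 1..3 and 0..1 and stored from address 100, element (2, 1) lies at 100 + (2 - 1) + (1 - 0) * 3 = 104. The unchecked variant saves one pair of comparisons per dimension, which is the saving available when the compiler can prove the subscripts correct.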

Another interesting feature in the proposed design that 
facilitates representation of static and dynamic arrays is 
the dope vector template or descriptor template in a 
predefined space in the packet. It is possible to create a 
dope vector template in the packet object at compilation 
time as the rank of any array is always statically 
determined. The dope vector template is similar in 
structure to the corresponding dope vector that would be 
eventually created on the process stack. The difference 
between the template and the actual dope vector is that the 
template might have fields that cannot be specified at 
compile time and the pointer field is kept undefined. The 
dope vector template (or any other descriptor template) 
proves to be extremely useful for fast creation of dope 
vectors on the activation stack. The dope vector frame 
along with the statically known fields could be copied on to 
the stack by a BLOCK MOVE (BMOVE) instruction. Three 
instructions LDV1, LDV2 and LDV3 are provided for easy 
loading of the three fields of a dope vector word with the 
contents available on top of the stack. 

Ada allows formal array parameters with one or more 
unconstrained dimensions. The dimensions derive their range 
constraints from the actual parameter. The mechanism of 
array parameter passing is also easily handled in the 
architecture by block copying (by BCOPY instruction) of the 
dope vector of the actual parameter onto the dope vector of 
the formal parameter. The actual array is re-created on the 
heap space and the capability for that area is stored in the 
pointer space in the dope vector for that array. It is 
important to note that the pointer space of the dope vector 
for a dynamic array always contains a capability to the 
actual array representation in the heap space. The same is 
true for dynamic fields of discriminant records. 


4.5 The packet Object 

The packet is the most important object in this design 
as it represents the output of the compiler for any 
compilable unit in Ada. The composition of the packet 
object proposed in this design appears, at first glance, to 
be quite similar to the module object in SWARD. But a 
closer look reveals that they are quite different. The 
packet object is composed of the following areas: 

a. The header information area (HIA) contains a list of 
indexes to the starting point of different areas in 
the packet object. It also contains a list of 
indexes to protected entry points in the instruction 
space of the packet object. Any word in this area 
carries the tag 1001. 

b. An area OVS is allocated in the packet for storing 
own variables. The own variables of the scalar type 
exist in this space and the descriptor/dope vector 
templates for own variables of composite structure 
are stored in this space. Such variables are 
created in the heap space and initialised by the 
code generated for package initialisation.⁵ When 
such an object is created in the heap, a capability 
for the object gets stored in the space allocated in 
the descriptor template (in the packet) for the 
object. 

c. A space in the packet object (DTS) is allocated to 
descriptor/dope vector templates for arrays and 
records used by the instructions in the packet. The 
corresponding arrays and records are created in the 
stack segment when they are declared in procedures 
or blocks, using the templates stored in this space. 
This space would also contain the rangewords 
(described in Section 4.4.2) and the constants. 

d. A space (CS) is reserved for storing capabilities to 
other packet objects. The capabilities stored in 
this space determine the authority to enter other 
packet objects. The 'use' clauses in Ada are 
translated as CREATE PACKET instructions. Execution 
of a CREATE PACKET instruction in the initialisation 
routine of the packet object creates a packet object 
named in the instruction and returns a capability 
for the object to the location in CS space specified 
in the instruction. 

e. Finally, the packet object has a space (IS) 
allocated for instructions. The instruction space 
would contain instructions for packet initialisation 
as well as instructions generated to represent Ada 
subprograms (procedures and functions). The indexes 
of entry points to various sections of the 
instruction space are stored in the HIA of the 
packet. In the case when a package in Ada is used 
for representing abstract data types, the IS of the 
packet would contain a section of the space 
allocated to instructions for creating an instance 
of the type. The entry point information for such a 
section would be available in the HIA (cf. section 
4.6). 

⁵ It should be noted that the notion of 'own' variable in 
Ada arises only in case of variables in Ada packages and a 
section of the code in the package is responsible for 
initialisation of such variables. 

The format of a packet object representation in the 
packet space of memory is shown in figure 4.9(a). It 
contains a fixed length section of five words called the 
packet header. The details of the header information area 
are shown in figure 4.9(b). A restriction in this design is 
that a packet could have at most 16 protected entry points. 
The relative addresses (relative to the end of the HIA) of 
the protected entry points are stored in the fields EPI1 to 
EPI16 in the packet header. The remaining word of the 
header contains indexes to the four areas in the packet 
(OVS, DTS, CS and IS) described earlier. 

A packet object is created by execution of a CREATE 
PACKET instruction. The instruction is similar to the 
CREATE MODULE instruction in SWARD. When a process enters 
a packet for execution, the capability register CR[9] gets 
loaded with the mapped capability for the packet object. 
Any address in the packet object is always evaluated with 
respect to CR[9]. Addresses, when specified in the 
instructions, are always relative to the beginning of the 
packet object. 

Figure 4.9(a): Representation of a Packet 

Figure 4.9(b): The Packet Header 

During activation of a procedure, a function or a block 
in the packet, the local scalar variables are created on the 
activation stack by explicit instructions in the section of 
the IS containing code corresponding to the procedure, 
function or block. The descriptor templates or dope vectors 
are created on the activation stack by BMOVE instructions. 
The BMOVE instruction requires the start and end of the 
source locations in the packet DCS space; the destination is 
automatically assumed to be starting from TOS. The majority 
of the instructions in the packet object would be stack 
oriented where the domain frame (corresponding to the 
packet) of the process stack segment is the implicit frame 
of reference. Thus the addresses of variables in the 
procedure, block or function are generated in the usual 
(lexical level, offset) form. 


4.6 Implementation of Abstract Data Types 

This section will deal with the support provided by the 
proposed architecture for implementation of abstract data 
types (ADT). 

When an Ada package defines an abstract data type, the 
specification section of the package would contain the 
details of the representation of the type within the Private 
or Limited Private type declaration. The packet 
representation of such a package would contain a protected 
procedure (called INSTANTIATE) for creating an instance of 
the type in the heap space. It should also be noted that 
the subprogram or package that wants to use the private or 
limited private type for abstract data typing must 
explicitly specify the package (defining the abstract type) 
through a USE clause. 

The mechanism of ADT implementation will be explained 
through an example. Let A be a packet object representing 
the package defining the abstract data type (say T). Let B 
be a packet representing a subprogram or package containing 
a declaration of a variable V of type T (i.e., V:T). The 
packet B would also have a capability (say C) for packet A 
in the CS area of B (corresponding to the use clause in the 
subprogram or package and obtained by the CREATE PACKET 
instruction in the initialisation section of packet B). 

The instructions in packet B corresponding to the 
elaboration of the declaration of V of the form V:T are as 
follows. To start with, a CREATE PACKET instruction would 
be executed that creates another instance of packet A, say 
A(V). The capability for A(V) (say C) would be returned and 
stored in the DCS area of B corresponding to the variable V. 
Now the entry point index for the INSTANTIATE procedure 
would be loaded on TOS and a COMPUTE CAPABILITY instruction 
would be executed with C as one of the operands. The 
capability with the appropriate index in the index/offset 
field would get deposited on TOS, on successful execution of 
the COMPUTE CAPABILITY instruction. The capability on TOS 
will be treated as an operand for the ENTER instruction that 
follows. The execution of the ENTER instruction would cause 
a domain switch. Then the INSTANTIATE procedure in packet 
A(V) would execute to create an instance of the ADT in the 
heap space and return a capability for the instance that 
would get stored in the DCS area allocated to T in A(V). 

To invoke any of the allowed operations on the variable 
V, two instructions are executed - 
1) a COMPUTE CAPABILITY instruction with C and the EP 
Index of the desired operation as the two operands, and 
2) an ENTER instruction with the capability on TOS (returned 
by the COMPUTE CAPABILITY instruction). 
It should be noted that the protected procedure representing 
the specified operation (for the ADT) in packet A(V) would 
act on the representation in the heap space. The 
representation is pointed to by the capability of the heap 
object stored in the DCS area in A(V) corresponding to the 
variable V. The above description shows how the packet 
objects and tagged capabilities allow a straightforward 
implementation of abstract data types. 
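
The two-instruction invocation sequence can be sketched as follows. This is a simplified model, not the proposed instruction semantics in full: a capability is reduced to a packet name, an 'enter' authority bit and an optional entry point index, a packet header is reduced to its list of entry point addresses, and all names (`Capability`, `compute_capability`, `enter`, `ProtectionError`) are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional

MAX_ENTRY_POINTS = 16  # a packet has at most 16 protected entry points


class ProtectionError(Exception):
    """Raised when a capability does not authorise the operation."""


@dataclass(frozen=True)
class Capability:
    packet: str
    enter_authority: bool
    index: Optional[int] = None   # entry point index, set by COMPUTE CAPABILITY


def compute_capability(cap, entry_index):
    """Return a copy of `cap` with the entry point index filled in."""
    if not cap.enter_authority:
        raise ProtectionError("capability lacks 'enter' authority")
    if not 1 <= entry_index <= MAX_ENTRY_POINTS:
        raise ValueError("entry point index must be in 1..16")
    return replace(cap, index=entry_index)


def enter(cap, headers):
    """Return the entry address selected by `cap` in the target packet."""
    if not cap.enter_authority or cap.index is None:
        raise ProtectionError("ENTER needs a computed capability")
    entry_points = headers[cap.packet]        # EPI1, EPI2, ... of that packet
    return entry_points[cap.index - 1]
```

In this model a caller holding only the plain capability C cannot jump to an arbitrary address in A(V): the target address always comes from the header of the entered packet, selected by the index placed in the capability.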

Chapter 5 
The Instruction Set 

The instruction set of the proposed architecture is 
summarized in this chapter. As indicated earlier this is a 
preliminary version of the instruction set and the 
instruction set is not complete without architectural 
support for tasking and input/output. The semantics of the 
proposed instructions are explained without any details of 
the exact formats and encoding. 

The instructions are grouped according to commonality 
of functions. The groups are: 

a. Stack group 

b. Capability group 

c. Branch group 

d. Control group 

e. Array group 

f. Arithmetic-Logic group 

g. Miscellaneous 

Whenever necessary one or more S*A statements have been 
used for clarity of semantics. 


5.1 The Stack Group 

The common characteristic of this group of instructions 
is that the process stack top is implied to contain the 
source operand or the destination of the result of these 
instructions. The instructions in this group are further 
classified by address form: 

1. L - Form: the address is specified in the 'lexical 
level, offset' form. The contents of CR[10],⁶ the 
capability register associated with the stack segment, is 
considered as the implicit base. 

2. C - Form: the address is interpreted relative to a 
capability register specified in the instruction. There 
are two subforms for this form of instructions. Ca: the 
offset is specified in the instruction word; Cb: the 
offset is assumed to be on TOS. 

3. P - Form: the address is interpreted relative to the top 
of the packet executing the instruction. The contents 
of CR[9] is the implicit base address for this form of 
instruction. 

4. S - Form: the address is computed from the capability 
(of form [..] or [..]) available either on the top of 
the stack (for load operations) or just below the data 
on the TOS (for store operations). 

The form of the instruction is indicated within 
parentheses in the mnemonic for the instruction. 


PUSH(L): The address of the operand is specified in the
lexical level, offset form. The semantics could be
represented as:

Stack [Sptr]: = Stack [DR[IR.LEX] + IR.offset],

where any address in the 'Stack' is computed relative to
CR[10].base. The effective address generated is checked
against CR[10].limit for the relation
effective address < CR[10].base + CR[10].limit.
(CR[10].base is implied to be the base address for
effective address calculation for all the instructions in
the group. Similarly, the check against CR[10].limit is
always performed.)
POP(L): The operand on TOS is stored in the stack location
specified in the instruction. The corresponding S*A
statement is:

Stack [DR[IR.LEX] + IR.offset]: = Stack [Sptr].
PUSA(L): The address specified by the lexical level, offset
form is pushed on top of the stack. The S*A representation
is:

Stack [Sptr]: = DR[IR.LEX] + IR.offset;
PUSI: The operand specified in the instruction is pushed on
the TOS.
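
The L-form semantics above can be illustrated with a small
sketch. It is not part of the proposed design: the class and
method names are hypothetical, and it assumes that Sptr is
advanced before a push and moved back after a pop, which the
S*A statements above do not spell out.

```python
class StackSegment:
    """Toy model of the stack segment addressed through CR[10] (illustrative names)."""

    def __init__(self, base, limit, size=64):
        self.mem = [0] * size   # flat memory that contains the stack segment
        self.base = base        # stands in for CR[10].base
        self.limit = limit      # stands in for CR[10].limit
        self.display = {}       # DR: lexical level -> offset of that level's frame
        self.sptr = -1          # offset of the current top of stack

    def _ea(self, offset):
        # Effective address relative to CR[10].base, checked against the limit.
        ea = self.base + offset
        if not (self.base <= ea < self.base + self.limit):
            raise IndexError("effective address outside the stack segment")
        return ea

    def _push_word(self, value):
        # Assumption: Sptr is advanced first, then the word is written.
        ea = self._ea(self.sptr + 1)
        self.sptr += 1
        self.mem[ea] = value

    def push_l(self, lex, offset):   # PUSH(L): push the addressed word
        self._push_word(self.mem[self._ea(self.display[lex] + offset)])

    def pop_l(self, lex, offset):    # POP(L): store TOS at the addressed word
        self.mem[self._ea(self.display[lex] + offset)] = self.mem[self._ea(self.sptr)]
        self.sptr -= 1

    def pusa_l(self, lex, offset):   # PUSA(L): push the address, not the contents
        self._push_word(self.display[lex] + offset)

    def pusi(self, literal):         # PUSI: push the operand held in the instruction
        self._push_word(literal)
```

For example, with four local words at lexical level 1, PUSI 42
followed by POP(L) 1,2 leaves 42 in the third local word, and
an offset beyond CR[10].limit is rejected before any memory
access takes place.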
PUSH(C): The operand address is computed relative to the
'base' of the capability register CR[i] (i=1...8) specified
in the instruction. The 'offset' is specified in the
instruction word for PUSH(Ca) and the offset is assumed to
be on TOS for PUSH(Cb). The S*A representation for PUSH(Ca)
is:

Stack [Sptr]: = Mem[CR(i).base + IR.offset];
POP(C): The operand on TOS is stored in the effective
address computed relative to the base of the capability
register CR[i] specified in the instruction. It has two
forms similar to the PUSH(C) instruction.
PUSH(P): The operand address is computed relative to the
'base' of CR[9], the capability register associated with the
packet executing the instruction. The operand is loaded on
TOS. There is no corresponding POP(P) instruction.
PUSH(S): The operand address is computed from a capability
of the form '010' or '100' available on TOS. The semantics
of the instruction implies invoking of the capability
mapping mechanism to obtain the 'base', 'limit' information
from the PMT. The corresponding POP operation is
represented by POP(S).

Three operations that should be included in the
instruction set are the increment, decrement and 'store
zero' operations [Tan78]. The instructions are as follows:
INC(L): The operand address is specified by the lexical
level, offset form of address specified in the instruction.
The operand is expected to be an integer word. The operand
is incremented by 1. The corresponding decrement operation
is performed by the DEC(L) instruction. The INC and DEC
operations are also available in form C.

STOZ(L,C): The instruction is available in forms L and C.
On execution of this instruction, the operand specified by
the effective address is set to zero.

CSTORE(L,C): (Check before storing): The instruction is
available in both forms. The address part of the
instruction specifies the destination address for the
ALU output. The semantics of the instruction ensures
checking of the ALU output against the upper and lower range
information available in the destination location. (The
instruction was explained in Chapter 4).
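
A minimal sketch of the CSTORE check follows. It assumes, for
illustration only, that the destination location can be
modelled as a record holding a lower bound, an upper bound and
a value, and that a failed check raises an exception; the
names RangedWord, ConstraintViolation and cstore are
hypothetical, and the actual word layout is the one defined in
Chapter 4.

```python
from dataclasses import dataclass


@dataclass
class RangedWord:
    """Assumed model of a destination location carrying its own range information."""
    lower: int
    upper: int
    value: int = 0


class ConstraintViolation(Exception):
    """Raised in this sketch when the ALU output falls outside the range."""


def cstore(dest: RangedWord, alu_output: int) -> None:
    # Check the ALU output against the range held in the destination,
    # and store it only if the check succeeds.
    if not (dest.lower <= alu_output <= dest.upper):
        raise ConstraintViolation(
            f"{alu_output} not in {dest.lower}..{dest.upper}")
    dest.value = alu_output
```

The point of the sketch is that the check and the store form
one operation, so the compiler does not have to emit separate
compare and branch instructions around every constrained
assignment.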

LDV1, LDV2, LDV3: These three instructions are provided for
loading of the three fields of the dope vector word. Only
the offset is specified in the instruction. The effective
address of the dope vector word is computed relative to the
contents of DR[Dis-top]. The operand is considered to be on
the TOS. LDV1, LDV2 and LDV3 load the 'upper bound',
'lower bound' and the 'length' fields of the dope vector
word respectively.


BMOVE(P) OP1, OP2: (Block Move) The instruction is provided
to move a block of words from the DCS area of the packet
(executing the instruction) to the area on the stack
segment, starting from the location pointed to by the Sptr
just prior to execution of this instruction. OP1 specifies
the index in the DCS area and OP2 specifies the length of
the block in words. The capability register CR[9] is implied
in computing the effective address in the DCS area of the
packet.


BCOPY ln, OP1, OP2: (Block copy) The instruction is
specifically meant for copying dope vectors between two
different visible lexical levels. OP1 and OP2 represent the
addresses of the source and destination dope vectors
respectively. The addresses are specified in lexical level,
offset forms. The parameter ln stands for the length of the
dope vector in words. The instruction requires that the
source and the destination locations have tags of '1010'.
The instruction would only be used for passing array
parameters. (The use of this instruction was further
explained in Chapter 4).


ALLOC ln: This instruction is used to create objects in the
heap space. It involves invocation of the heap manager.
The structure to be created on the heap space is built on
the temporary area of the latest activation record. The
length information ln present in the instruction specifies
the actual length of the representation in words. The heap
manager would allocate the necessary space in the heap. The
structure is copied onto the heap space and deallocated from
the stack. The capability of form '001' is returned on TOS.
A similar instruction is PALLOC ln, OP1. The difference
between the two instructions is that the capability returned
is stored in the DCS space of the packet object. The
operand OP1 represents the offset in the DCS area and points
to the capability space reserved in the template for the own
variable under consideration.


5.2 The Capability Group 

The instructions in this group are used for loading and
storing capability registers and for computing new
capabilities of the forms '010', '011' and '000' from a
capability of the form '001'. An instruction for modifying
the 'authority' field of a capability is also included in
this group.
LCR i: (Load Capability register CR(i) for (i=1...8)). The
instruction expects a capability word on the TOS. To start
with, the capability on TOS is stored in the CRB area
corresponding to CR(i) of the latest domain frame. Then the
capability mapping mechanism is invoked. The mapped
information is loaded into CR(i) along with the 'authority'
field of the original capability, and the 'L' bit in the
register is set.

SCR i: (Store Capability register CR(i) for (i=1...8)). The
instruction loads the content of the CRB area corresponding
to CR(i) in the latest domain frame, and resets the L bit in
CR(i). When the L bit in a capability register is not set,
the content is invalid.

CCAP(L) OP1: (Compute Capability) In the L-form of the
instruction, OP1 represents an address in the present
domain. The instruction expects a capability of the form
'001' in that address. If the authority field has 'enter'
right, a capability of form '000' is produced and the
'index' field is loaded with the index from the TOS. If the
authority field has other rights, then a capability of form
'010' is produced with 'offset' field loaded from the TOS.
Finally the capability is deposited on the TOS.
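
The decision made by the compute capability instruction can be
sketched as follows. The record layout is an assumption made
for illustration: the field names, the representation of the
authority field as a set of rights, and the copying of the
authority into the derived capability are not taken from the
design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    form: str             # three-bit form tag, e.g. '001'
    authority: frozenset  # rights, e.g. frozenset({'enter'}) (assumed encoding)
    obj: int              # identifier of the named object (assumed field)
    field: int = 0        # 'index' for form '000', 'offset' for form '010'


def ccap(source: Capability, tos: int) -> Capability:
    """Derive a new capability from a '001' capability and the word on TOS."""
    if source.form != '001':
        raise ValueError("CCAP expects a capability of form '001'")
    if 'enter' in source.authority:
        # 'enter' right: produce an enter capability, TOS supplies the index.
        return Capability('000', source.authority, source.obj, tos)
    # Other rights: produce a data capability, TOS supplies the offset.
    return Capability('010', source.authority, source.obj, tos)
```

A capability of form '000' produced in this way is what the
ENTER instruction expects on the TOS, so an operation of an
abstract data type is invoked by CCAP followed by ENTER.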

CCAP(P) OP1: The instruction is similar to the previous one
except that the effective address computed using OP1
represents a location in DCS/CS space having the desired
capability.


5.3 The Branch Group

The instructions in the Branch group are used to branch 
within the instruction space of the packet. There are 
essentially two classes of instructions in this group: 
conditional branch and unconditional branch. For the 
mnemonics of the conditional branch instructions, the 
conditions are specified within parentheses. 

The following instructions are in the class of
conditional branch instructions.
CBRZ(=) OP1: The data on the TOS is an implicit operand.
This data is compared for equality with zero. If the
condition is satisfied, the program counter is incremented
by the number of words specified in OP1. For any branch
instruction, the effective branch address is checked against
the 'base', 'limit' information from CR[9]. The
instructions CBRZ(≤), CBRZ(<), CBRZ(>), CBRZ(≥) and CBRZ(≠)
are similar to the instruction explained above with
different conditions for checking.
CBR(>) OP1: This conditional branch instruction assumes the
contents of the top two locations of the stack as the two
operands to be compared. If the contents of the TOS is
greater than the operand below it, the branch takes place.
Similar instructions for the other five conditions are
available. These instructions are to be used for
implementing IF and WHILE statements.
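
The two conditional branch classes can be sketched as below.
The sketch makes assumptions that the text does not state:
the operands are consumed from the stack, a branch that is not
taken falls through to the next word, and the check against
CR[9] has the same form as the stack segment check. All
function names are hypothetical.

```python
import operator

# Condition codes as they might appear within the parentheses of the mnemonic.
CONDITIONS = {
    "=": operator.eq, "/=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def _branch(taken, pc, op1, base, limit):
    if not taken:
        return pc + 1                 # assumed one-word fall-through
    target = pc + op1                 # PC incremented by OP1 words
    if not (base <= target < base + limit):
        raise IndexError("branch target outside the packet")
    return target


def cbrz(cond, stack, pc, op1, base, limit):
    """CBRZ(cond): compare the TOS with zero and branch if the condition holds."""
    value = stack.pop()
    return _branch(CONDITIONS[cond](value, 0), pc, op1, base, limit)


def cbr(cond, stack, pc, op1, base, limit):
    """CBR(cond): compare the TOS with the operand below it."""
    top = stack.pop()
    below = stack.pop()
    return _branch(CONDITIONS[cond](top, below), pc, op1, base, limit)
```

Here base and limit stand in for the contents of CR[9], and a
Python list with its last element as the TOS stands in for the
process stack.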
LOOP OP1 OP2: This instruction is provided for conditional
looping over a set of instructions. The loop limit (i.e.,
the TO part of a corresponding FOR instruction in Ada) is
evaluated and pushed onto the TOS before executing this
instruction. The offset of the loop variable in the stack
is provided as the operand OP1; the forward branching offset
in the IS of the packet is specified in OP2. The branch is
executed if the loop variable equals the value in the TOS.
A similar instruction is LOOP where reverse checking is
done for the loop variable and the loop limit.

The only other branch instruction is the unconditional
branch. The mnemonic for the instruction is BR.


5.4 The Control Group 

This group includes instructions for invocation of
blocks, procedures and domain switching. All these
instructions have been explained earlier in Chapters 3 and
4. A brief summary is provided in this section.


CALL OP1: The instruction is always in form-L. The operand
in the instruction is the address of the procedure
identifier in lexical level, offset form. The procedure
identifier is stored in the process stack at the lexical
level of declaration of the procedure. The S*A description
for the instruction was provided in Chapter 3 as the
procedure M2.CALL.

RETN: The return instruction causes return from a procedure
to the instruction in the packet sequentially next to the
CALL instruction. The semantics of the instruction is
represented by the S*A description M2.RETURN provided in
Chapter 3.

PCALL OP1: The instruction is designed for calling
procedures with parameters. The instruction would get
executed only if the PB flag in the machine is set. The
semantics of the instruction has already been described as
the S*A description M2.PCALL in Chapter 4.

PRETN: The instruction would be used for returning from a
procedure called through a PCALL instruction. The S*A
description for this instruction was given as M2.PRETURN in
Chapter 4.

FRETN: The instruction would be used for returning from a
function call. The semantics is similar to RETN, with the
difference that before execution of FRETN the result of the
function call is left in the TOS (specifically in the
logically top-most register among the four registers
representing the stack top). Any instruction following the
FRETN would be able to obtain the result from the hardware
register though the block representing the function body
gets deallocated. Simple scalar results are available in
the top-most register. For composite structures, the
structure is allocated in the heap space before executing
FRETN. The FRETN instruction loads the capability returned
on the top of the stack to the logically top-most register
in the group of four registers representing the top of stack
for the following instruction.

The S*A descriptions for the instructions for block
invocation and exit (ENTRY, EXIT, PBEGIN, PEND) have already
been provided in Chapters 3 and 4 (for mechanism M2).

ENTER: This is a zero address instruction for domain
switching. The instruction expects a capability of the form
'000' with 'enter' authority. The execution of the process
is transferred to the packet specified by the capability and
the instruction execution starts at the entry point
specified in the 'index' field of the capability. The
instruction causes creation of a new domain frame on the
process stack and the lexical level of the entered procedure
is considered as 1. The details of the semantics of this
complex instruction were provided in Chapter 4.


5.5 The Array Group 
Three instructions are proposed in this thesis that
deal with array structures.


DAS OP1, OP2: (Declare array space) The instruction is meant
for creating the space for the array on the temporary area
of the activation record and returns the base address of the
space to the array base field in the dope vector. OP1
specifies the base address of the dope vector in lexical
level, offset form and OP2 specifies the rank of the array.

INDEX OP1: The instruction computes the address of the
actual array element using the dope vector address and
subscripts available in sequence on the stack. The address
is returned on the TOS after removal of the subscripts and
the dope vector address from the stack. The instruction
checks each and every subscript against the lower and upper
bounds specified in the dope vector for the corresponding
dimension. The operand OP1 specifies the rank of the array.

SINDEX OP1: (Safe Index): The instruction is similar to the
INDEX instruction but subscript bound-checking is not
performed.
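
The address computation behind INDEX and SINDEX can be
sketched as follows. The sketch assumes row-major storage, one
word per element, and a dope vector that holds the array base
together with (lower bound, upper bound, length) for each
dimension; the text fixes only the presence of these fields,
so the layout and the names DopeVector and index are
illustrative.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DopeVector:
    base: int                          # array base field filled in by DAS
    dims: List[Tuple[int, int, int]]   # (lower, upper, length) per dimension


def index(dope: DopeVector, subscripts: List[int], rank: int,
          check: bool = True) -> int:
    """Element address for INDEX (check=True) or SINDEX (check=False)."""
    if rank != len(dope.dims) or rank != len(subscripts):
        raise ValueError("rank does not match dope vector or subscripts")
    offset = 0
    for (lower, upper, length), sub in zip(dope.dims, subscripts):
        if check and not (lower <= sub <= upper):
            raise IndexError(f"subscript {sub} outside {lower}..{upper}")
        # Row-major accumulation (assumed storage order).
        offset = offset * length + (sub - lower)
    return dope.base + offset
```

For an array declared with bounds 1..3 and 0..4 at base 100,
the element with subscripts (2, 3) lies at address
100 + (2 - 1) * 5 + 3 = 108 under these assumptions.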


5.6 The Arithmetic Logic Group 

The tagged memory representation allows generic 
arithmetic operations. The arithmetic instructions operate 
on the top two elements of the stack. The operands are 
popped off the stack and the result is pushed on the TOS.
The instructions are ADD, SUB, MULT, DIV and NEG (changes 
the sign of the operand). 
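
The following sketch illustrates what generic arithmetic on
tagged operands might look like. It assumes that each stack
word carries an 'int' or 'real' tag, that both operands must
carry the same tag, that the operand below the TOS is the left
operand, and that integer division truncates toward zero; none
of these details is fixed by the text, and the function name
alu is hypothetical.

```python
def alu(op, stack):
    """Apply one arithmetic instruction to a stack of (tag, value) pairs."""
    if op == "NEG":
        tag, value = stack.pop()
        stack.append((tag, -value))
        return
    tag_top, top = stack.pop()
    tag_below, below = stack.pop()
    if tag_top != tag_below:
        raise TypeError("operand tags differ")
    if op == "ADD":
        result = below + top
    elif op == "SUB":
        result = below - top
    elif op == "MULT":
        result = below * top
    elif op == "DIV":
        if tag_top == "int":
            # Truncate toward zero rather than flooring (assumed behaviour).
            quotient = abs(below) // abs(top)
            result = quotient if (below >= 0) == (top >= 0) else -quotient
        else:
            result = below / top
    else:
        raise ValueError(f"unknown instruction {op}")
    stack.append((tag_top, result))
```

The same ADD mnemonic therefore serves integer and real
operands; the tags select the operation, so the compiler need
not emit type-specific opcodes.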

All the standard logical operations are available. The
instructions are AND, OR, XOR, COMP, GT, LT, EQ, LSHFT and
RSHFT.


5.7 The Miscellaneous Group 
The instructions in this group are: EXCH, DUP, CI and
CR.

EXCH : (Exchange) The top two elements of the stack are
exchanged by this operation.

DUP : (Duplicate) The contents of TOS is duplicated and
pushed on to the stack.

CI : (Convert to integer) A real operand on TOS is
converted to an integer representation.
The result replaces the operand.

CR : (Convert to real) The integer operand is converted
to real and the result replaces the
operand.

Chapter 6
Conclusion

This thesis demonstrates how the architectural concepts
of capability based addressing and stack oriented
instruction sets may be combined to yield an execution
environment congenial to execution of Ada programs. Two of
the principal contributions of the new language Ada are its
features for representing the concepts of abstract data
types and information hiding. None of the commercially
successful architectures provides any architectural features
that support implementation of such advanced programming
language concepts. On the other hand, some architectures
(e.g., iAPX 432, IBM SWARD, IBM System 38) do
provide architectural features for supporting these
concepts, but the complexity of the mechanisms is far
beyond the necessities of an essentially compile time bound
language environment (like Ada).

One of the prime objectives of this research was to
design an architecture that supports efficient execution of
Ada programs and allows efficient implementation of the
packages and abstract data types in Ada. The stepwise
systematic development of the architecture in the thesis was
provided to demonstrate that the majority of the
architectural features were introduced with the objective of
providing efficient execution support for Ada programs. It
should be obvious from the thesis that no unnecessarily
complex mechanisms were introduced just for the sake of
innovation. The capability based addressing was primarily
adopted as the basic addressing mechanism to provide elegant
support for implementing packages and abstract data types in
Ada. The commonality of the objective for the design of the
language Ada and the capability architecture for supporting
development of large scale reliable software was indicated
in Chapter 1.

The importance of efficient architectural support for
variable addressing mechanism in a block structured language
environment was established in Chapter 3. A major
contribution of this thesis is the proposal of a methodology
for choosing the implementation technique for
exo-architectural components of a language directed
architecture. Two new complexity measures were proposed
that serve as a basis for this methodology. A new result
was derived (in Chapter 3), using the complexity measures,
that indicated the superiority of Rohl's mechanism over the
other three mechanisms for variable addressing in Ada. The
basic architectural support for variable addressing in the
proposed architecture was designed using this methodology.

Instead of proposing yet another
architecture for Algol-like languages, this thesis has
concentrated on some of the special features of Ada that
require efficient run time support.

The semantics of the parameter mechanism in Ada is
quite different from that of previous languages in its
requirement for run time parameter type checking. The
implementation of this mechanism on conventional
architectures tends to become quite inefficient. The
examination of an effort to develop an experimental Ada
compiler on VAX 11/780 at CMU emphasizes the point [SeH80].
The provision of the new instructions (PBEGIN, PCALL, PEND
and PRETN) in this architecture practically eliminates the
problem.

Similarly, the introduction of new primitive data types,
hardware support and the special instructions CSTORE and
LOAD RANGE significantly reduces the execution time overhead
for dynamic constraint checking. The proposed architecture
also provides support for representation and handling of
dynamic arrays and discriminant records in Ada.

Myers has indicated that the IBM SWARD machine could be
considered to be directed towards Ada [Mye82]. A study of
the architecture reveals that a module object can represent
a module in MODULA, but the architecture has no facilities
for abstract data typing (as in Ada). Similarly the
architecture has no equivalent concept of lexical level
addressing in Ada. The architecture was not designed to
provide any support for addressing free variables; thus it
cannot provide any support for subroutine management in
Ada. Moreover the architecture uses the long tag approach
and is specifically designed to support execution time bound
languages. The execution of programs written in a strongly
typed language like Ada will be unnecessarily slow on this
architecture.

An architecture that is heavily publicized as a
high-level Ada machine is the iAPX 432 from Intel. The
characterization is inaccurate in the sense that it has few
direct relationships to Ada [Mye82]. The architecture could
be better characterized as an 'operating system machine'.
This architecture is essentially a representation of the
HYDRA operating system on silicon [LEV81]. It has excellent
features for operating system support for memory management,
process synchronization, scheduling, etc. The architecture
has adopted the partitioned memory approach to capability
architecture design. Hence, domain switching is not very
efficient. Moreover, every CALL in this architecture is a
domain switching operation. There are no capability
registers for address translation, though it provides
on-chip associative memories for address translation. The
architecture does not provide any architectural support for
free variable addressing or for automatic subroutine
management (as the CALL or PCALL instruction does in this
design).
As indicated earlier, the proposed design is not
complete in the true sense. The instruction set is not
precisely defined in terms of formats and encoding of the
fields. Moreover, architectural support for tasking and
exception handling in Ada was not considered.